Live from Washington, D.C., it's theCUBE, covering .conf2017, brought to you by Splunk. Well, welcome back to .conf2017. Here we are at Splunk's annual get-together. With Dave Vellante, I'm John Walls. We are live in the Walter Washington Convention Center in beautiful Washington, D.C. I say that, proud to be a native. Actually, raised here, live here. Fly the flag here. Wow, this is my place, Dave. Listen, I love this city. I love coming down here, lots to do. My son's down here, so. But if we weren't here, where should we be? Maybe on the deck of a Carnival Cruise Line ship right now? That would be good. I would like that. I would love to have theCUBE on the deck of a Carnival. Maybe Ruel Waite can swing that. What do you think? Ruel Waite joins us. He is a manager of delivery and support for Carnival, and you got room for two on the next ship out of Miami? Listen, man, for you guys, anything. I love that. Yeah, yeah, yeah, yeah. You're hired. I can make it happen. Outstanding. All right, Ruel, thanks for being here with us. No problem. Glad to have you, and here at the show as well. All right, so let's talk about, first off, Splunk. Yeah. What are you doing? First off, well, let's back up in terms of what you do. Yeah. Your core responsibilities, and then we'll get into the Splunk story after that. Yeah. I manage the support operation for our e-commerce platform, as well as for the guest-facing shipboard applications. So the e-commerce platform is where you go and purchase your cabin on the web. You would also be able to purchase your shore excursions, your spa treatments as well, or we have an e-retail site where, if you have a friend who's sailing, you can buy a bottle of champagne and have it in their room for when they get there. So all those purchasing functionalities we support on the e-commerce platform.
And then the guest-facing applications shipboard, we're talking about the mobile application where guests chat and interact with each other or plan their day. We're talking about the Pixels application where guests purchase the photos that they take throughout their cruise. And there's some facial recognition stuff there as well. And the ITV that's in your room. So we have several, many different sorts of applications that fit under that portfolio. Let's talk about the data. There's a lot of data in what you just described. Yeah. What's the data pipeline look like? Where does Splunk fit? Yeah, so we Splunk as much as we can. And we're continuing to build that as we go. Our application logs are Splunked. Everything we produce from the application. Also our performance metrics from our servers and our databases and our network and all of those systems. We Splunk that because that's critical for us to triage issues that are occurring. Because our operation is about monitoring what's happening. It's about resolving issues as quickly as possible. And it's about communicating to our business. So those three things, you know, data is central to all of that, right? So we need to get as much as we can and we need to be able to get insights into it. Can you talk about, you know, where you started? You mentioned off camera about four years ago. Yeah. And how you've been able to inject automation into your processes, and just take us through your journey. Yeah, so, you know, we started a few years ago with Splunk and it was primarily a triage tool for us, right? So an incident would occur. We'd try to get in and look at some logs, figure out what's going on. And as we've evolved, it's become more of a proactive alerting tool for us. It's become a communication tool, a collaborative tool for us. You know, because we leverage things like ITSI, right? That allows us to understand the baseline behavior of our system, right?
Once you baseline that, then we can understand the spikes. We can understand when things are changing, and that allows us to react and quickly identify defects in our system, things that are occurring, and resolve them, right? So once we, you know, once we kind of got our legs around, okay, we get how to use Splunk to find stuff, now let's figure out how to get Splunk to tell us stuff. Okay. And then, you know, now once Splunk is telling us stuff, let's figure out how we tell the business that stuff. So that's kind of the journey we've had. And Splunk's in that thread the whole way. The whole way. So ultimately then, I mean, right now, what are you putting into practice that maybe you didn't have available? Yeah, sure. Two, three years ago. Yeah, sure. So one of the challenges we had was, with a typical e-commerce site, you have several layers of the application, right? You have your web servers. You have, you know, caching infrastructure. You have a database server. And, you know, we have a mainframe reservation system as well. So there are several teams involved with supporting all those different platforms. Now when we have an incident, you know, it was sometimes challenging. You get somebody on the phone and you're like, hey, what are you seeing over there on the mainframe side? Well, I see this error occurring. Oh, on the database side, they're telling you, okay, we're seeing some sort of timeouts here, but we're not sure if it's related to the same thing you're talking about. And we didn't have a way to tie it together. But by using transactions, what we decided to do was log the session ID, the web server session ID, across all our layers, right? And push that through, and that allows us to tie those transactions together across those layers. And now when we have an incident, when we're talking to the mainframe, we're saying, hey, guys, go look at this.
And they say, oh, here's what I'm seeing. We can isolate it and we can pull it together. And it's really helpful. So are you trying to get to the point where you can automate the remediation? Or is that something you don't want to do because you want humans involved? You know, automation is good. And whatever we can automate, we try to do that. At this point, we're not automating the resolution through Splunk. But what we are doing is providing the on-call, or the engineers that are responding, with as much information as we can in order to have them quickly flip that switch. So if we have an alert where we know, hey, this issue requires a recycle of an application pool or some other action like that, we put that in our Splunk alert. And we say, hey, we're seeing this issue occur. That email and that text message that goes out actually tells the engineer that these are the suggested actions that you can take in order to quickly resolve this issue. Ruel, what are you hearing from the business side? What are the business drivers, and how is that affecting what you're doing in IT generally, and specifically with data and Splunk? Okay, so from the business side, we're looking mostly at bookings, right? That's one of the major metrics that we look at. And our guest experience, right? So on the web, that means the site needs to be available, it needs to perform, and it needs to work, right? And so what we really are trying to do with Splunk is understand those issues that are impacting our guests on the booking side. What that means is we need to know how well we are converting. And if we are looking at homepage performance, we can now tell, hey, if our homepage loads in five seconds versus three seconds, this is how many fewer people make it to our payment page, which is huge for us, right? So that's something that we really try to hone in on.
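The load-time comparison described here (how many fewer sessions reach the payment page when the homepage is slow) can be sketched in a few lines. This is only an illustration: the session records, field names, and the 4-second threshold are invented for the example and are not Carnival's data or a Splunk query.

```python
from collections import defaultdict

# Hypothetical session records: homepage load time in seconds and whether the
# session reached the payment page. All values are invented for illustration.
SESSIONS = [
    {"load_seconds": 2.8, "reached_payment": True},
    {"load_seconds": 3.1, "reached_payment": True},
    {"load_seconds": 2.5, "reached_payment": False},
    {"load_seconds": 3.4, "reached_payment": True},
    {"load_seconds": 5.2, "reached_payment": False},
    {"load_seconds": 4.9, "reached_payment": True},
    {"load_seconds": 5.6, "reached_payment": False},
    {"load_seconds": 6.0, "reached_payment": False},
]


def conversion_by_load_bucket(sessions, threshold_seconds=4.0):
    """Share of sessions reaching payment, split into fast and slow page loads."""
    counts = defaultdict(lambda: {"sessions": 0, "reached": 0})
    for session in sessions:
        # The threshold is an assumed cut-off, chosen only for this example.
        bucket = "fast" if session["load_seconds"] < threshold_seconds else "slow"
        counts[bucket]["sessions"] += 1
        counts[bucket]["reached"] += int(session["reached_payment"])
    return {name: c["reached"] / c["sessions"] for name, c in counts.items()}


rates = conversion_by_load_bucket(SESSIONS)
```

As the conversation goes on to note, other variables affect conversion too, so a split like this shows an association with load time rather than proof that load time alone caused the difference.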
And it really helps us to collaborate with the business and understand, really, what is the revenue impact of these IT metrics that we're spitting out? But there could be other factors involved in that too, other variables, right? There are. You can't just say, you know, this is it. But if you have enough of a track record there over a couple of years, you can say, okay, a five-second load means this, we get a 30% conversion rate; we get three seconds, man, we got that a hell of a lot lower, and now we have a 50%, whatever. Yeah, and that is why what I'm excited about at the conference is the machine learning capabilities that we've been hearing about, right? Because that will allow us to then model all those different factors that go into when someone goes from the homepage to payment. You're totally right. There are several things that go into that. And we want to be able to model, hey, on a normal day, you know, here's our guest behavior. When we have a sale, you know, how do our guests behave differently? Or, you know, on a Monday night at 8 p.m., what is the behavioral trend? So it's all important to us, and getting the data behind it, and being able to model that, is going to be really key for us. Connect the dots for me on how you'll use machine learning, and how will that affect the business? So you'll make different offers at different times? No, so what I mean is, if I understand how guests behave, I will know if I'm having an issue on the site. If there's something happening that's impacting their ability to book. Because sometimes you do a release, you do your quality control, and then you go home, everything looks good. And sometimes hours later, sometimes days later, unfortunately, something pops up that you introduced during that release. And understanding what that baseline is, right? So what Splunk has allowed us to do is say, okay, here's what normal behavior is.
And we're trying to grow this more, but we've been using ITSI to say, here's what that behavior really is, right? Based on what we know are the metrics around bookings, right? Here's what that behavior is. And we do a release, and we see a spike, a change, and now we're able to say, wait a minute. We never saw this error before. This error never existed in our system at any point. That was definitely something that was introduced right here in this release. We need to go ahead and resolve this as well. And sometimes you get some false positives there. If your development team has changed the way they log a little bit, you might get a spike, but that's cool because you get to go in immediately and figure out what those changes are. And you get a comfort level that you kind of understand how your system works. Let me ask you another question. I mean, you've had some experience with Splunk, obviously a few years working with them. What in your mind is on their to-do list? What do you want to see out of them? Oh man. Doug, if I'm Doug, he's a hit. Tell me, where should I go? What should I do? Any gripes? Give me the good, the bad, and the ugly. No. For me, it's performance, performance, performance. I want to see my queries run as quickly as possible. I want to see things fast. I want to click a button and it happens right away. Now, obviously that's not realistic, but I like some of the things that Splunk is doing. You look at the new metrics index that they've been talking about the last two days. So they have now isolated your time series data, and they're able to optimize the searches on time series data separate from your application logs. So your CPU, your memory consumption, that data is not the same as logging an error or logging that a booking was created or something like that. Those are kind of two different things.
So they have kind of decoupled that and they're saying, okay, anything that's time series, I'm going to put it over here and I'm going to optimize that query, and then you can handle your other logs separately. But the additional benefit of that is you can take your time series and look at a CPU spike, and then you can take your event data and overlay it on top. And then you can see, wait a minute, this event is what caused that spike. So that's really cool. I think they call that mstats, is that right? Yes, mstats, yes. How about the stuff that you saw this week in the keynotes? Today in particular was the product stuff, a lot of security obviously. Anything that you've seen here at the show that excites you, where you really said, I've got to have that, I've got to learn more? Yeah, so the ITSI event analytics really seems like something that's going to be cool for us. So as I've said before, we utilize ITSI internally. So we put together our glass tables that show us, okay, here are all the different components and the hierarchy of things, and when this goes red, it affects these other layers, and it's really cool. But what they've added in is the ability to click a button and drill in to those components, and then you have a view of, hey, here are the events associated with that. That's really cool because now you're triaging in one place, you get to the problem really quick, and you can go directly into your Splunk queries. What we're looking for is to resolve issues as quickly as possible. As you describe it, if I understand this correctly, you can visualize the dependencies. Yes, exactly. And you can take remedial action, or identify, or inform the business what to expect, to be much more proactive. That's what people are talking about. Yeah, and one of the surprising things we've found with Splunk is that our business people are users of Splunk as well.
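The overlay idea mentioned a little earlier in this exchange, lining up event data against a spike in a time series, can be sketched outside Splunk in plain Python. This is not `mstats` or any Splunk search; the sample data, the threshold, and the 30-second look-back window are all assumptions made for the illustration.

```python
# Hypothetical CPU samples as (timestamp_seconds, percent) pairs.
CPU_SAMPLES = [(0, 20.0), (60, 22.0), (120, 95.0), (180, 30.0)]

# Hypothetical application events; the "ts" and "message" fields are invented.
EVENTS = [
    {"ts": 30, "message": "booking created"},
    {"ts": 110, "message": "cache flush started"},
]


def find_spikes(samples, threshold):
    """Return the timestamps whose metric value exceeds the threshold."""
    return [ts for ts, value in samples if value > threshold]


def events_before(spike_ts, events, window_seconds):
    """Return events logged within window_seconds before (or at) the spike."""
    return [e for e in events if 0 <= spike_ts - e["ts"] <= window_seconds]


spikes = find_spikes(CPU_SAMPLES, threshold=80.0)
overlay = {ts: events_before(ts, EVENTS, window_seconds=30) for ts in spikes}
```

An event that precedes a spike is a candidate cause to investigate, not a confirmed one, which matches the triage use described in the interview.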
So it was always an IT tool, something that only the geeks are going to look at. And then all of a sudden you present a dashboard to a business user and they go, ah, that's pretty, right? And then all of a sudden they want it more than you do. So that's what makes it great, right? Because you can present the data however you want and you can put it in a way that different audiences can consume. And so it becomes a platform that goes across the organization, which is really, really cool. But sure, bottom line's all speed, right? Yeah, yeah, yeah. I mean, take care of my problems faster, get my tests faster, deliver faster. Come on, Splunk. Come on, let's go. We want to go. Let's do it again, faster, right? Yeah, get more, you know, get more sleep, get more sleep. Well, thanks for being with us. We appreciate that. And, you know, we'll talk about the cruise. I mean, I know Leonard Nelson, our producer over here, has already said, book him for a massage, the presidential suite. He wants one at night, and then the champagne buffet. It's done. It's fast internet, though. Yeah, it's fast internet, yeah. All right, we're simple people. We don't need all that, but we'll talk later. All right, man, appreciate it. Thank you. Thank you for being with us. Ruel Waite, joining us from Carnival. Back with more from Splunk .conf2015, 2017. 2015, where did that come from? 2017. Been a long day.