ServiceNow Knowledge 14 is sponsored by ServiceNow. Here are your hosts, Dave Vellante and Jeff Frick. Good morning everybody, welcome to San Francisco. This is Dave Vellante with Jeff Frick. We're here live at the Moscone Center. This is day three of ServiceNow Knowledge. Last night was the big customer event. It was at Fort Mason. Taste of San Francisco, they had little food courts all over the Fort Mason venue. It was fantastic, music was blasting. Had to be, you know, most of the 6,600 customers were there enjoying just an awesome event. Big, a lot of excitement here. Dan McGee is with us, just coming off the keynote. He's the Senior Vice President of Engineering and Cloud Operations at ServiceNow. Dan, welcome to theCUBE, how do you feel? Feel great, yeah. Great job, you had a packed house given that you had the big party last night. Yeah, we told a lot of people what we were going to talk about and it seemed like there was a lot of interest in it, so I was happy to see all of that. You must be a little nervous, right? I gave you the Thursday morning, but yeah. Surprisingly not, you know, I was anxious to tell the story and so I had fun. Yeah, it was good. So the cloud is a scary place, I'm told. Yeah, well, it can be, it can be. Well, that's kind of how your presentation started, right? You had this, you know, video going on and I really liked sort of the way you framed it. But so why is the cloud a scary place for a lot of people and let's get into what you guys are doing about it. Yeah, well I think our first point was that, you know, not all clouds are the same. People have different sort of design points. 
There's, you know, the consumer space that is targeting for sort of one thing, which is getting eyeballs to their location and then there's the departmental clouds like Salesforce and others that are fundamentally focused on just a piece of the enterprise and then there's the enterprise class clouds where having that stuff up and working all the time is the most important. So that's kind of where it all starts. And then the second thing is just fundamentally truth in advertising, right? We really believe strongly in reflecting what the real availability is for customers real time, not some sort of fictitious number that we all tell ourselves about how things are working. We really want our numbers to reflect what customers are experiencing. Yeah, we're going to talk about that some more. So okay, so the consumer cloud, a lot of it's free. So if it's down, you know, and or a lot of even, you know, sort of low-end clouds that you pay for, a lot of times the SLA is, hey, we'll do our best and if we don't send us an email, we might get back to you sooner or later. You made a point that the departmental clouds, you know, it's okay if your CRM is down for a little bit on the weekend. Because you said the sales guys are out golfing or whatever they're doing, but they're not like hardcore, you know, banging away, but your customers are. That's right. Yeah, you know, when I first started working at ServiceNow, you know, that was a very early lesson. Folks are using our system in mission critical applications and they can't afford the system to be down any time of the day. A lot of folks or a lot of cloud providers try to hide behind the, well, we'll take the system down at 2 a.m. on Saturday. Well, you know, where it's 2 a.m. on Saturday in New York, it's not 2 a.m. on Saturday in, you know, Sydney, Australia. And so that just doesn't work for enterprise-class providers. You said you have one care provider who actually utilizes ServiceNow at the bedside. Yeah, they do. 
In fact, that's where I learned that lesson most clearly. You know, when you're down, we have to find a different way to provide care to our patients. Well, Dan, is the expected uptime for a cloud different than people expect from their own enterprise internal IT? I don't think so. They have external, you know, internal SLAs that they have to maintain the number of nines that you're expected to do. You know, I don't think so. You know, I think the end users in the enterprise, fundamentally, they just get really annoyed when they want to use it and they can't use it. It's really that simple. So it's black and white for them. If the system's down, they're angry. If it's up, everything's going fine. And so, you know, I don't think there's, I don't think you can get good enough in terms of availability for us that are all trying to do our jobs. For example, right now, if we're trying to do this interview and the lights went out or your computer went out, you know, that would be arguably a mission critical situation. Right, right. It wouldn't be very tolerant. It's just interesting in terms of the changing expectations because I've used Salesforce in a past life and you know, you get the little notice every, whatever, you know, we're going to be down from X to Y and you don't really think much about it because it always comes up at some frequency. So that's a very different kind of expectation setting than, you know, being mission critical or on the finite side. And furthermore, you know, they've set the expectation that when they give you that notice, it's not negotiable because, and they can't be negotiable for them because they have a multi-tenant architecture. They have to take hundreds of customers out at the same time. They can't be having individual conversations because there's no way you're going to get 500 people to agree on when you can take an outage. 
ServiceNow's architecture, which is single instance, we actually can have that conversation with customers. If we say, you know, we're going to do some sort of planned maintenance and it's going to create a glitch for you and they're going to say, hey, not this week, can we do it next Monday? We can actually accommodate that request because of our architecture. Okay, so now let me play a little skeptic here. So somebody might say, all right, well, that's fine, Dan. I buy that, but you know, you're comparing, we're going to do some comparisons in a little second, but you're talking about Amazon's massive scale, hundreds of thousands of customers. You guys only have 2,000 customers. I mean, you know, we're talking about a much smaller scale. Isn't it easier for you to sort of deliver those services? Yeah, well, you know, the reality of it is, you know, we don't have a button on our website that says, you know, if you don't like the fact that we're going to do a maintenance window, you know, click that and we'll automatically change it. You know, customers have to make the effort to call us and contact us. It is effort, but it's an effort that we're actually able to do. That's the big thing. We can do it. Whereas other folks out there that have these multi-tenant architecture, they don't even have the choice. Even if they wanted to do it, they can't. Yeah, now, so let's talk about scale. Sort of tongue in cheek there, saying to you guys a couple thousand customers, but you have 12 million users, right? So a very large number of average, over 5,000 users per customer. So that's your scale, that's where you're at today and you're growing very rapidly. Talk about that a little bit. Yeah, so I think that's again just sort of a demonstration of once this product gets deployed at a particular company, it grows like a weed. It is so easy to use and so easy for people to put applications together. 
The penetration in an account grows like wildfire and I think that's just a testament to how easy it is to generate applications, get them up live and get them going. So let's talk some more data points. You've got 16 data centers and they're replicated, right? So they're proximate and you do that because of your architectural approach of not being multi-tenant is one reason, right? You're able to fail over very quickly. Actually that's more driven by customers' desire to have their data in their region. So the data sovereignty issue, for example, the best example is the Swiss banks, right? Swiss banks, there's a Swiss finance law where the data can't leave the country and so we have a replica pair of data centers in Zurich and in Geneva and that satisfies their requirement. That's really more the drive for having replica data centers in the region. Now isn't that a similar law, for instance, I have to ask you this, is there a similar law in Germany, but you can, how do you cover Germany? Do you just cover it with some other EU country and that suffices or? Yeah, to date the German folks are served out of our Amsterdam and London data centers and so far that's working fine. World's always changing but the point of the matter is we have an architecture that treats all these data centers the same. They're managed with one network operation center, they're managed with the one incident process I described today, so if we do decide we need to put a couple of data centers in Germany and it's financially viable for us to do so, we can do that. It's not a re-architecture, it's not a two-year exercise, it's not a promise we're gonna get it done in a year and then get it done in five. It's something we can do. And you've got a pretty large CMDB. We do. Yeah, so our CMDB like most of our customers populates and holds all the data associated with our cloud. 
We can map directly from a customer instance all the way through all the various networking components and hardware components all the way down to the disk drives if we need to so we know what is dependent on what. If a customer has an issue with their service we can very quickly figure out where the problem is. Likewise on the other extreme if we have a problem at the very lowest layer we can quickly figure out what service levels are being impacted and get after it very quickly. All of our time is spent actually remediating the issue or actually doing a failover. We spend very little time sort of hunting for what the problem is. Okay, a couple of other quick stats. 24,000 instances at Knowledge 14 just to support this event and you did it with essentially one person. You said there's really two people but somebody had a lead so it was really one FTE. 2.5 million individual entries in this CMDB. So talking about 3.6 billion transactions per month. Yeah, yeah, so big numbers. I think they're demonstrating a couple of things. On the CMDB side it's demonstrating not only how well articulated our infrastructure is but it's showing customers how articulated their infrastructure can be. They can run their operation with ServiceNow just like we run our operation with ServiceNow and make life a lot easier. On the scale side it shows transaction rate is really the amazing thing about ServiceNow and our ability to scale transactions as we talked about in the keynote. We're not really that storage centric of an application. People are not storing gobs and gobs and gobs of petabytes in our stuff but they are doing sort of 3x the transaction rate per customer of anybody else out there and our ability to actually deliver that without customers even having to worry about it is one of the great things about ServiceNow. All right, so we're a little tight on time so I want to get into really the heart of what I wanted to talk about today. 
Anybody who's followed Wikibon and Silicon Angle over the years knows that we've been oftentimes talking about, you know when somebody talks about five nines or six nines, it's irrelevant. What matters is what the user sees and we're going to talk about that a little bit. So you put up a chart, let's start. What is a nine? Everybody talks about five nines, four nines, two nines. What's a nine? Yeah, so 99.99% availability means basically you have five minutes of downtime a month. So actually I think it's five nines is that. So 99.99 is five minutes of downtime a month. Five minutes a month. Doesn't sound like much. Doesn't sound like much unless you're doing an interview right here and your system goes down for five minutes. That's a big problem. Yeah, so it matters, it always does. There's 100% is what people expect. Okay, and then you shared some data. I apologize not having this chart for our audience, but I'll sort of read it off. I took a snapshot, or actually somebody sent me a snapshot. So you're at 99.995, average uptime. That's the number. Now we compared that to, you compared that to Salesforce, which is 99.8, Workday 99.5. That's one of those departmental clouds. We don't have to go through them all. NetSuite actually pretty good, 99.6 and Amazon 99.5. But your other point was that's only part of the story. You have to include planned downtime. So that's unplanned downtime. What about planned downtime? And you guys, very low, right? Six hours compared to the other guys. Some as high as 68 hours or over 100 hours. That's per quarter, right? Yeah, and that's back to, again, that architecture stuff, right? So we are a single instance architecture, which means we routinely fail somebody over from their primary to their backup in order to perform maintenance on the primary site. So the only downtime they're going to experience is the time involved in that failover, which is a very short period of time. 
The other folks out there that are multi-tenant, doing that failover is a very risky and a very scary thing. And in fact, some of those companies don't do it as a result. They'll tell you they have a high availability design, but they really don't use it because it's such a scary factor. So they have to take these big outages every week to go do maintenance on stuff and then bring it all back up. And then the other two key metrics, RTO and RPO, for those of you who don't know what they mean, recovery time objective, recovery point objective, recovery time objective is how long it takes to get the application back up and running, recovery point objective is how much data you're actually going to lose in hours, essentially. And your RTO was two hours and your RPO is one hour. And pretty much everybody else, well, workday was an hour for RPO. A lot of people aren't published, but on RTO, recovery time objective, we're talking 12 hours. The other guys aren't published. Amazon Glacier, a poke at them, that's their archiving service. That could be, who knows, weeks. But you're talking about a pretty tight RTO and RPO. Yeah, the RPO is kind of a funny one, but let me first talk about RTO. So RTO is the one that really matters. That's how long am I going to be down when I have an outage, right? And again, as I was talking about a second ago, because we will fail you over, your downtime with us is actually quite fast. Those other guys have to do fix in place. So their recovery time objective is either 12 hours or greater or not even published where we're able to get you back up and running in minutes, quite a difference. Then you get to the recovery point objective where they have a similar time to us, but it's kind of a goofy time because if they're going to be down for 12 hours and you're not able to write new data to the system, how could they say they've got a recovery point objective less than 12 hours? Yeah, right. Okay, and then now, this is the real exciting piece. 
Let's talk about your real availability dashboard. So you guys say, okay, you've been reporting like everybody else has been reporting, which is essentially, if I ping the server, it's up. That's right. But you guys are changing that mentality. Talk about that a little bit. Yeah, that's all right. So we're a very customer-centric organization and so it always really bothered us that we would sort of show customers, hey, we're 99. whatever percent availability and some of them would roll their eyes and say, well, no, you're not. Well, why not? Well, we have these other issues too. So we're taking that head on. The way we are now going to talk about availability is actually the true customer experience of availability. And so that includes the ping thing that we talked about, but it also includes software errors on our part or performance problems that prevent us from actually delivering the service. You see this in other companies' products where they'll say, stand by for technical difficulties or page won't load. That's a situation where they were able to ping the system but it wasn't actually able to deliver value. But we're not gonna stop there. We're also going to reflect third-party network issues and even customer issues that they've actually done to create an outage in the instance or their inability to actually access the instance. So when a customer goes to their personalized availability dashboard at ServiceNow, which, by the way, is something only we can do because we are single instance, the other guys that have availability dashboards, they're talking about the whole family of people that are on the same database. But when they go to our dashboard, it's gonna reflect their real experience with the product, not ours. Yeah, so in the analyst meeting on Monday, you use that example of check back later, we can't get to the application right now. You said, is this up? And then I yelled out, no, and you said, well. That was you. 
But in reality, it would be measured as not as downtime, it would be by most, but in your metric, it would be measured as down. If my internet's down, same thing, right? Any reason that I can't get to the application. That's exactly right. And in fact, if you just wake up today and say you're mad at something and you want to file a P1, it's going to get reflected there as well. Yeah, okay, so. It's your data, it's the collective experience that you actually are feeling. So even if it's not your fault, essentially. That's exactly right. You're measuring it. That's the key. To an end user, that's where the value is. Now you showed the real availability dashboard. You showed a demo, and I wrote down 99.97. That was one customer example, is that right? That's an example we showed today. So will you begin, so you're going to expose that on a customer by customer basis? Will you expose that? It's there today. Any customer can go to their homepage, which has been there forever. It's where they would go and file incidents and do some of the other things we showed in the demo. But now on their homepage, it's actually their personalized availability dashboard. It's because we have the CMDB, because we map every instance to every customer. It's a very easy thing for us to produce. And it's there, and it's live, and it's real time. Okay, we got to go, but I got one last word here. I'm nervous about my prospect. I'm nervous about security. Can you definitively say that your security is going to be better than my on-premise security? No, I can't, I can't. But what I would like to volunteer is, let's have a discussion, right? Let's compare what we do for security with what you do for security, customer does for security. And then let's make a joint decision. The same posture that we just described around customer availability, where we're going to be very transparent, very forthright. We're going to be a partner with our customers. 
We are applying the same sort of philosophy to security. We're not going to claim that we're better than anybody, but I think we do have a simpler problem to solve than many of our customers. I made the point in the keynote that we have a very homogenous infrastructure. We do one thing. We host ServiceNow applications. Many of our customers are hosting thousands and thousands of applications throughout their infrastructure. What that means to me is we probably have fewer opportunities to get tripped up than some of our customers do. But ultimately, it's a customer's decision. They need to be comfortable with what they're going to put in our cloud. Let's be educated. Let's talk about what we have and then make the call. Yeah, I like that answer. It's not a, I always ask that question. And then I think the right response is, let's have that conversation. Really, you can't say one's better than the other. You have to have a detailed, deep discussion about it. And it's got to be transparent. It's got to be auditable, ideally. All right, Dan, thanks very much for coming to theCUBE. Great to see you. Nice to see you all today. All right, keep it right there. Everybody, we'll be back with our next guest. We're live. This is theCUBE from Knowledge 14. We'll be right back.
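
Editor's note on the "nines" arithmetic from the interview: the exchange about whether 99.99% or five nines corresponds to five minutes of downtime a month can be checked directly. The sketch below is not from the speakers. It assumes a 30-day month and a 90-day quarter, and the function names are hypothetical. Under those assumptions, 99.99% allows roughly 4.3 minutes a month (so "five minutes" describes four nines), while five nines allows about 26 seconds. The second function shows how much of a quarter a given number of planned maintenance hours would consume if every planned hour were counted as downtime, which is an assumption for illustration and not how any vendor named in the interview is known to report.

```python
# Editor's sketch, not from the interview: availability percentages vs. downtime.
# Assumptions: a 30-day month and a 90-day quarter; all names are hypothetical.

MINUTES_PER_MONTH = 30 * 24 * 60   # 43,200 minutes in a 30-day month
HOURS_PER_QUARTER = 90 * 24        # 2,160 hours in a 90-day quarter


def downtime_minutes_per_month(availability_pct: float) -> float:
    """Minutes of downtime per month permitted by an availability percentage."""
    return (1 - availability_pct / 100) * MINUTES_PER_MONTH


def planned_downtime_pct(planned_hours_per_quarter: float) -> float:
    """Share of a quarter (in percent) taken up by planned maintenance hours,
    assuming every planned hour counts fully as downtime."""
    return planned_hours_per_quarter / HOURS_PER_QUARTER * 100


if __name__ == "__main__":
    # Availability figures quoted in the interview, plus five nines for comparison.
    for pct in (99.5, 99.8, 99.99, 99.995, 99.999):
        minutes = downtime_minutes_per_month(pct)
        print(f"{pct}% -> {minutes:.2f} minutes of downtime per month")

    # Planned-downtime figures quoted in the interview (hours per quarter).
    for hours in (6, 68, 100):
        share = planned_downtime_pct(hours)
        print(f"{hours} planned hours per quarter -> {share:.2f}% of the quarter")
```

Read this way, planned maintenance can dominate the picture: 68 planned hours in a quarter is a little over 3% of the quarter, far more than the fraction of a percent separating the unplanned-availability figures quoted above.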