It's theCUBE! Covering the Virtual Vertica Big Data Conference 2020, brought to you by Vertica. Welcome back everybody, my name is Dave Vellante and you're watching theCUBE's coverage of the Vertica Virtual Big Data Conference. theCUBE has been at every BDC and it's our pleasure in these difficult times to be covering BDC as a virtual event. In this digital program, we're really excited to have Joy King joining us. Joy is the vice president of product and go-to-market strategy at Vertica. And if that weren't enough, she also runs marketing and education programs. So Joy, you're a multi-tool player. You've got the technical side and the marketing gene. So welcome to theCUBE, you're always a great guest. Love to have you on. Thank you so much, David, it's a pleasure, it really is. So I want to get in, we'll have some time, we've been talking about the conference and the virtual event, but I really want to dig in to the product stuff. It's a big day for you guys, you announced 10.0, but before we get into the announcements, step back a little bit. You know, you guys are riding the waves. I've said to a number of our guests that Vertica's always been good at riding the wave. Not only the initial MPP, but you embraced HDFS, you embraced data science and analytics and the cloud. So what are the trends that you see, the big waves that you're riding? Well, you're absolutely right, Dave. I mean, what I think is most interesting and important is because Vertica is at its core a true engineering culture founded by, well, a pretty famous guy, right? Dr. Stonebraker, who embedded that very technical Vertica engineering culture. It means that we don't pretend to know everything that's coming, but we are committed to embracing the technology trends, the innovations, things like that. We don't pretend to know it all, we just do it all.
So right now, I think I see three big imminent trends that we are addressing, and we have been for a while, but that are particularly relevant right now. The first is a combination of, I guess it's disappointment in what Hadoop was able to deliver. I always feel a little guilty because she's a very reasonably capable elephant. She was designed to be HDFS, highly distributed file store, but she can't be an entire zoo. So there's a lot of disappointment in the market, but a lot of data in HDFS. You combine that with some of the, well, not some, the explosion of cloud object storage. You're talking about even more data, but even more data silos. So data growth and data silos is trend one. Then what I would say trend two is the cloud reality. Cloud brings so many advantages. There are so many opportunities that public cloud computing delivers, but I think we've learned enough now to know that there are also some realities. The cloud providers themselves, Dave, don't talk about, oh, it's cheaper. Well, because it's not. Is it more agile? Can you do things without having to manage your own data center? Of course you can, but the reality is it's a little more pricey than we expected. There are some security and privacy concerns. There are some workloads that can't go to the cloud. So hybrid and also multi-cloud deployments are the next trend, and they are mandatory. And then maybe the one that is the most exciting in terms of changing the world, we could use a little change right now, is operationalizing machine learning. There's so much potential in the technology, but it somehow has been stuck, for the most part, in science projects and data science labs, and the time is now to operationalize it. Those are the three big trends that Vertica is focusing on right now. That's great. I wonder if I could ask you a couple of questions about that. I mean, I, like you, have a soft spot in my heart for the elephant.
And the thing about Hadoop that was, I think, profound was that it got people thinking about bringing compute to the data and leaving data in place, and it really got people thinking about data-driven cultures. It didn't solve all the problems, but it collected a lot of data, and we can now take your third trend and apply machine intelligence on top of that data. And then the cloud is really the ability to scale, and it gives you that agility, and it's really that cloud experience. It's not just the cloud itself. It's bringing the cloud experience to wherever the data lives. And I think that's what I'm hearing from you. Those are the three big superpowers of innovation today. That's exactly right. So, you know, I have to say, I think we all know that data analytics, machine learning, none of that delivers real value unless the volume of data is there to be able to truly predict and influence the future. So the last seven to 10 years have been correctly about collecting the data, getting the data into a common location, and HDFS was well designed for that. But we live in a capitalist world and some companies stepped in and tried to make HDFS and the broader Hadoop ecosystem be the single solution to big data. It's not true. So now the key is how do we take advantage of all of that data? And now that's exactly what Vertica is focusing on. So as you know, we began our journey with Vertica back in 2007 with our first release and we saw the growth of Hadoop. So we announced many years ago Vertica SQL on Hadoop. The idea was to be able to deploy Vertica on Hadoop nodes and query the data in Hadoop. We wanted to help. Now with Vertica 10, we are also introducing Vertica in Eon Mode. And we can talk more about that, but Vertica in Eon Mode for HDFS. This is a way to apply an ANSI SQL database management platform to HDFS infrastructure and data in HDFS file storage. And that is a great way to leverage the investment that so many companies have made in HDFS.
And I think it's fair to the elephant to treat her well. Okay, well, let's get into the hard news. 10.0, you've got a mature stack. What are the highlights of 10.0 and then we can drill into some of the particulars? Absolutely. So in, well, in 2018, Vertica announced Vertica in Eon Mode. Eon Mode is the separation of compute from storage. Now this is a great example of Vertica embracing innovation. Vertica was designed for on-premises data centers and bare metal servers, tightly coupled storage, DL380s from Hewlett Packard Enterprise, Dell, et cetera. But we saw that cloud computing was fundamentally changing data center architectures. And it made sense to separate compute from storage. So you add compute when you need compute, you add storage when you need storage. That's exactly what the clouds introduced, but it was only available on the cloud. So the first thing we did was architect Vertica in Eon Mode, which is not a new product, Dave. This is really important. It's a deployment option. And in 2018, our customers had the opportunity to deploy their Vertica licenses in Eon Mode on AWS. In September of 2019, we then broke an important record. We brought cloud architecture down to earth and we announced Vertica in Eon Mode with communal or shared storage, leveraging Pure Storage FlashBlade. That gave us all the advantages of separating compute from storage. All of the workload isolation, the scale up, scale down, the ability to manage clusters. And we did that with on-premise data centers. And now with Vertica 10, we are announcing Vertica in Eon Mode on HDFS and Vertica in Eon Mode on Google Cloud. So what we've got here in summary is Vertica in Eon Mode, multi-cloud and multiple on-premise data storage. And that gives us the opportunity to help our customers, both with the hybrid and multi-cloud strategies they have and with unifying their data silos. But Vertica 10 goes farther. Well, so let me stop you there because I just want to mention.
So we talked to Joe Gonzalez at MassMutual, who essentially was brought in, and one of his tasks was to lean into Eon Mode. Why? MassMutual had three separate data silos and they wanted to bring those together. They're investing heavily in technology. Joe's an expert, so that really put data at their core, and Eon Mode was a key part of that because they're using S3, so that was a very important step for those guys. But carry on, what else do we need to know about 10.0? Well, so one of the reasons, for example, that MassMutual is so excited about Eon Mode is because of the operational advantages. You think about exactly what Joe told you about multiple clusters serving multiple use cases and maybe multiple divisions. And look, let's be clear, marketing doesn't always get along with finance and finance doesn't necessarily get along with ops. And IT is often caught in the middle. Vertica in Eon Mode allows workload isolation, meaning allocating the compute resources that different use cases need without allowing them to interfere with other use cases, and allowing everybody to access the data. So it's a great way to bring the corporate world together but still protect them from each other. And that's one of the things that MassMutual is going to benefit from, as well as so many of our other customers. I also want to mention, so when I saw you last year at the Pure Storage Accelerate Conference, you said, Dave, we are the only company that separates compute from storage that runs on-prem and in the cloud. And I was like, hmm, and I had to think about it. I've researched it. I still can't find anybody else who does it. I want to mention, you actually beat a number of the cloud players with that capability. So, good job, and I think it is a differentiator, assuming that you're giving me that cloud experience and the licensing and the pricing capability. So I want to talk about that a little bit. Well, you're absolutely right. So let's be clear.
There is no question that the public clouds introduced the separation of compute and storage and these advantages, but they do not have the ability or the interest to replicate that on-premise. For Vertica, we were born to be software only. We make no money on underlying infrastructure. We don't charge as a package for the hardware underneath. So we are totally motivated to be independent of that and also to continuously optimize the software to be as efficient as possible. And we do the exact same thing to your question about licensing. Cloud providers charge per node. It makes sense. That's how they charge for their underlying infrastructure. Well, in some cases, if you're talking about a use case where you have a whole lot of data, but you don't necessarily have a lot of compute for that workload, it may make sense to pay per node. Then it's unlimited data. But what if you have a huge compute need on a relatively small dataset? That's not so good. Vertica offers per node and per terabyte for our customers depending on their use case. We also offer perpetual licenses for customers who want CAPEX, but we also offer subscription for companies that say, nope, I have to have OPEX. And while this can certainly cause some complexity for our field organization, we know that it's all about choice, that everybody in today's world wants it personalized, just for me. And that's exactly what we're doing with our pricing and licensing. So just to clarify, you're saying, I can pay by the drink if I want to. You're not going to force me necessarily into a term or I can choose to have a more predictable pricing. Is that correct? Well, so it's partially correct. So first, Vertica subscription licensing is a fixed amount for the period of the subscription. We do that because so many of our customers cannot, and I'm one of them, by the way, cannot tell finance what the budget forecast is going to be for the quarter after I spend.
I'm required to say what it's going to be before. So our subscription pricing is a fixed amount for a period of time. However, we do respect the fact that some companies do want usage-based pricing. So on AWS, you can use Vertica by the hour, and you pay by the hour. We are about to launch the very same thing on Google Cloud. So for us, it's about what do you need and we make it happen, natively, directly with us or through AWS and Google Cloud. So I want to understand. So the fixed is some floor, and then if you want to surge above that, you can allow usage pricing if you're on the cloud, correct? You can actually license your Vertica cluster by the hour on AWS, and you run your cluster there. Or you can buy a license from Vertica for a fixed capacity or a fixed number of nodes and deploy it on the cloud. And then if you want to add more nodes or add more capacity, you can. It's not usage-based for the license that you bring to the cloud, but if you purchase through the cloud provider, it is usage-based. Yeah, okay, and you guys are in the marketplace. Is that right? That's right, exactly. So if I want OPEX, I can do that. I can choose to do that. That's awesome. You can choose OPEX usage through the AWS marketplace or OPEX fixed directly from Vertica. Yeah, because every small business who then goes to a Salesforce management system knows this. You're like, okay, great, I can pay by the month. Well, yeah, well, not really. Here's our three-year term, right? And it's very frustrating. Well, and even in the public cloud, you can pay by the hour or by the minute or whatever, but it becomes pretty obvious that you're better off if you have reserved instance types or committed amounts. And that's why Vertica offers subscription that says, hey, you want to have 100 terabytes for the next year? Here's what it will cost you. We do interval billing. You want to do monthly, quarterly, biannual?
We'll do that, but we won't charge you for usage that you didn't even know you were using until after you get the bill. And frankly, that's something my finance team does not like. Yeah, I think, I know this is kind of a wonky discussion, but so many people gloss over the licensing and the pricing, and I think my takeaway here is optionality, pricing your way. That's great. Thank you for that clarification. Okay, so you've got Google Cloud. I want to talk about storage optionality. If I count them up, I've got S3. I've got, I'm presuming, Google now. You've got Pure. So Google is an S3-compatible storage, yeah. So Pure Storage. So the Google object store, right? Like Google object store, Amazon S3 object store, HDFS, Pure Storage FlashBlade, which is an object store on-prem. And we are continuing on this path because ultimately we know that our customers need the option of having next generation data center architecture, which is sort of shared or communal storage. So all the data is in one place, but workloads can be managed independently on that data. And that's exactly what we're doing, and we already have it in two public clouds and two on-premise deployment options today. And as you said, I did challenge you back when we saw each other at the conference. Today, Vertica is the only analytic data warehouse platform that offers that option on-premise and in multiple public clouds. Okay, let's talk about the innovation cocktail, I'll call it. So it's the data, applying machine intelligence to the data, and we talked about scaling at cloud and some of the other advantages of cloud. Let's talk about the machine intelligence, the machine learning piece of it. What's your story there? Give us any updates on your embracing of tooling and the like. Well, quite a few years ago, we began building some in-database, native in-database machine learning algorithms into Vertica.
And the reason we did that was we knew that the architecture of MPP columnar execution would dramatically improve performance. We also knew that a lot of people speak SQL, but at the time, not so many people spoke R or even Python. And so what if we could give access to machine learning in the database via SQL and deliver that kind of performance? So that's the journey we started on. And then we realized that actually machine learning is a lot more, as everybody knows, than just algorithms. So we then built in the full end-to-end machine learning function, from data preparation to model training, model scoring and evaluation, all the way through to full deployment. And all of this, again, SQL accessible. You speak SQL, you speak to the data. And the other advantage of this approach was we realized that accuracy was compromised if you downsample. If you moved a portion of the data from a database to a specialty machine learning platform, you were challenged by accuracy and also what the industry is calling replicability. And that means if a model makes a decision, like let's say credit scoring, and that decision is in any way challenged, well, you have to be able to replicate it to prove that you made the decision correctly. And there was a bit of a, you know, blowup in the media not too long ago about a credit scoring decision that appeared to be gender biased. But unfortunately, because the model could not be replicated, there was no way to disprove that. And that was not a good thing. So all of this is built into Vertica. And with Vertica 10, we've taken the next step. Just like with Hadoop, we know that innovation happens within Vertica but also outside of Vertica. We saw that data scientists really love their preferred language like Python. They love their tools and platforms like TensorFlow. With Vertica 10, we now integrate even more with Python, which we have had for a while, but we also integrate with TensorFlow and PMML. What does that mean?
It means that if you build and train a model external to Vertica, using the machine learning platform that you like, you can import that model into Vertica and run it through the full end-to-end process, and run it on all the data. No more accuracy challenges, MPP columnar execution so it's blazing fast. And if somebody wants to know why a model made a decision, you can replicate that model and you can explain why. Those are very powerful. And it's also another cultural unification, Dave. It unifies the business analyst community who speak SQL with the data scientist community who love their tools like TensorFlow and Python. Well, I think, Joy, that's important because with so much of machine intelligence and AI, there's a black box problem where you can't replicate the model. Then you do run into potential bias, like the gender example that you're talking about there, and there are many. Let's say an individual is very wealthy, he goes for a mortgage and his wife goes for some credit, she gets rejected, he gets accepted. It's the same household. But the bias in the model, there may be gender bias, there could be race bias. And so being able to replicate that and open up and make the machine intelligence transparent is very, very important. It really is, and that replicability as well as accuracy is critical because if you're downsampling and you're running models on different sets of data, things can get confusing. And yet you don't really have a choice because if you're talking about petabytes of data and you need to export that data to a machine learning platform and then try to put it back and get the next set the next day, you're looking at way too much time. Doing it in the database, or training the model and then importing it into the database for production, that's what Vertica allows and our customers are so excited. Because, of course, you know, they are the trailblazers, they've always been, and this is the next step in blazing the ML trail.
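The downsampling and replicability points in this exchange can be illustrated with a small sketch. This is not Vertica code and does not use any Vertica API: it is plain Python on synthetic data, and the `MODEL` coefficients, the `score` function, the column names and the 0.5 threshold are all hypothetical stand-ins for a model trained elsewhere and then imported for scoring. It shows two things under those assumptions: an answer computed on 1% samples may differ from the answer on all the data, and a fixed model re-applied to the same input reproduces the same decision.

```python
import math
import random

# Hypothetical coefficients standing in for a trained, imported model.
MODEL = {"intercept": -1.0, "income": 0.00003, "debt": -0.00005}


def score(row, model=MODEL):
    """Logistic score in (0, 1); the same row and model always give the same value."""
    z = (model["intercept"]
         + model["income"] * row["income"]
         + model["debt"] * row["debt"])
    return 1.0 / (1.0 + math.exp(-z))


def make_rows(n, seed):
    """Build n synthetic applicants (made-up ranges, seeded so the run is repeatable)."""
    rng = random.Random(seed)
    return [
        {"income": rng.uniform(20_000, 200_000), "debt": rng.uniform(0, 100_000)}
        for _ in range(n)
    ]


def approval_rate(rows, threshold=0.5):
    """Fraction of rows whose score meets the (hypothetical) approval threshold."""
    approved = sum(1 for r in rows if score(r) >= threshold)
    return approved / len(rows)


rows = make_rows(100_000, seed=7)

# Scoring all the data gives one answer.
full_rate = approval_rate(rows)

# Scoring 1% samples gives an estimate that depends on which rows were drawn.
sample_rates = []
for sample_seed in range(5):
    sample = random.Random(sample_seed).sample(rows, 1_000)
    sample_rates.append(approval_rate(sample))

# Replicability: re-scoring the same applicant with the same model
# reproduces the original decision exactly.
applicant = rows[0]
first_decision = score(applicant) >= 0.5
replayed_decision = score(applicant) >= 0.5

print("full data:", full_rate)
print("1% samples:", sample_rates)
print("decision replayed identically:", first_decision == replayed_decision)
```

The design choice worth noting is that the model is a fixed, versionable artifact applied to every row, which is what makes a challenged decision replayable; the sampled runs are there only to show how estimates drift when the model sees different subsets.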
Joy, customers want analytics. They want functional analytics, full function analytics. What are they pushing you for? And what are you delivering? What's your thought on that? Well, I would say the number one thing that our customers are demanding right now is deployment flexibility. What the CIO or the CFO mandated six months ago, thou shalt, whatever that thou shalt is, is different now. What I tell them is, it is impossible to know what you're going to be commanded to do or what options you might have in the future. The key is not having to choose, and they are very, very committed to that. We have a large telco customer who has multi-cloud as their commitment. Why multi-cloud? Well, because they see innovations available in different public clouds. They want to take advantage of all of them. They also admittedly see that there's the risk of lock-in, right, like any vendor. They don't want that either. So they want multi-cloud. We have other customers who say, we have some workloads that make sense for the cloud and some that we absolutely cannot do in the cloud. But we want a unified analytics strategy. So they are adamant in focusing on deployment flexibility. That's what I'd say is first. Second, I would say, is the interest in operationalizing machine learning, but not necessarily forcing the analytics team to hammer the data science team about which tools are the best tools. That's probably number two. And then I'd say number three is performance at scale, because when you look at companies like Uber or The Trade Desk or AT&T or Cerner, when they say milliseconds, they think that's slow. When they say petabytes, they're like, yeah, that was yesterday. So performance at scale, good enough for Vertica is never good enough. And it's why we're constantly building at the core, the next generation execution engine, database designer, optimization engine, all that stuff.
I also want to ask you, when I first started following Vertica, with theCUBE covering the BDC, one of the things I noticed in talking to customers and people in the community is that you have a Community Edition. It's a free edition and it's not neutered. Have you maintained that ethos through the transition into Micro Focus? And can you talk about that a little bit? Absolutely. Vertica Community Edition is Vertica. It's all of the Vertica functionality. Geospatial, time series, pattern matching, machine learning, all of Vertica. Vertica in Eon Mode, Vertica in Enterprise Mode. All of Vertica is in the Community Edition. The only limitation is one terabyte of data and three nodes. And it's free. Now, if you want commercial support where you can file a support ticket and things like that, you do have to buy the license, but otherwise it's free. And people say, well, free for how long? Like, our field sometimes asks me that. And I say forever. And they say, what do you mean, forever? Because we want people to use Vertica for use cases that are small, that they want to learn, that they want to try. And we see no reason to limit that. And what we look for is when they're ready to grow, when they need the next set of data that goes beyond a terabyte, or they need more compute than three nodes, then we're here for them. And it also brings up an important thing that I should remind you or tell you about, Dave, if you haven't heard it. And that's about the Vertica Academy, academy.vertica.com. Well, what is that? That is all self-paced, on-demand training, as well as Vertica Essential Certification Training. And certification means you have seven days with your hands on a Vertica cluster hosted in the cloud to go through all the certifications. And guess what? All of that is free. Why? Why would you give it for free?
Because for us, empowering the market, giving the market the expertise, the learning they need to take advantage of Vertica, just like with Community Edition, is fundamental to our mission. Because we see the advantage that Vertica can bring, and we want to make it possible for every company all around the world to take advantage of it. I love that ethos of Vertica. I mean, it's obviously a great product, but it's not just the product. It's the business practices and really progressive pricing and embracing of all these trends and sort of not running away from the waves, but really leaning in. Joy, thanks so much. Great interview, really appreciate it. And I wish we could have been face-to-face in Boston, but I think this is the prudent thing to do. I promise you, Dave, we will, because the Vertica BDC in 2021 is already booked. So I will see you there. Awesome, Joy King, thanks so much for coming on theCUBE. And thank you for watching. And remember, theCUBE is running this program in conjunction with the virtual Vertica BDC. Go to vertica.com slash BDC 2020 for all the coverage. And keep it right there. This is Dave Vellante for theCUBE. We'll be right back.