It's theCUBE, covering the Virtual Vertica Big Data Conference 2020, brought to you by Vertica.
Hello everybody, welcome back to theCUBE's coverage of the Vertica Big Data Conference, the Virtual BDC. My name is Dave Vellante and you're watching theCUBE. We're here with Joe Gonzalez, who's a Vertica DBA at MassMutual Financial. Joe, thanks so much for coming to theCUBE. I'm sorry that we can't be face to face in Boston, but at least we're being responsible. So thank you for coming on.
Thank you for having me, it's nice to be here.
Yeah, so let's set it up. Talk a little bit about MassMutual. Everybody knows it's a big financial firm, but what's your role there and kind of your mission?
So my role is Vertica DBA. I was hired January of last year to come on and manage their Vertica cluster. They'd been on Vertica for probably about a year and a half before that, started out on an on-prem cluster and then moved to Enterprise mode on AWS, and brought me on just as they were considering transitioning over to Vertica's Eon mode. And they didn't really have anybody dedicated to Vertica, nobody who really knew and understood the product. And I'd been working with Vertica for probably six, seven years at that point. I was looking for something new and landed a really good opportunity here with a great company.
Yeah, you have a lot of experience in Vertica. You've had a role in market research, so you're a data guy, right? I mean, that's really what you've been your entire career.
I am. I worked with Pitney Bowes in the postage industry, I worked in healthcare auditing, about seven years in market research, and then I've been with MassMutual for a little over a year now.
Yeah, quite so. So tell us a little bit about kind of what your objectives are at MassMutual, what you're kind of doing with the platform, what applications you're supporting. Paint a picture for us if you would.
Certainly.
So, MassMutual just decided to make Vertica its enterprise data warehouse. So they've really bought into Vertica and we're moving all of our data there. Probably a good 80 to 90% of MassMutual's data is going to be on the Vertica platform in Eon mode. And we have a wide usage of that data across the corporation. Right now we're at about 50 terabytes and growing quickly, with a wide variety of uses. So there's a lot of ETLs coming in overnight, loading a lot of data, transforming a lot of data, and a lot of reporting tools are using it, currently Tableau and MicroStrategy, we have Alteryx using it, and we also have APIs running against it throughout the day, 24-7, with people coming in, especially now these days with some financial uncertainty going on, a lot of people coming in checking their 401ks, checking their insurance status and whatnot. So we have to handle a lot of concurrent traffic on top of the normal big queries. So it's quite a diverse cluster and I'm glad they're really investing in using Vertica as their overall solution for this.
Yeah, I mean, these days you're checking your 401k like this, right? I'm afraid to look. So I wonder, Joe, if you could share with our audience, I mean, for those who might not be as familiar with the history, not just of Vertica specifically, but of MPP. Historically, you had the traditional RDBMS, whether it's DB2 or Oracle, and then you had a spate of companies that came out with this notion of MPP. Vertica is, I think, probably one of the few, if not the only, brands that survived. But what did that bring to the industry, and why is that important for people to understand in terms of scale, performance, cost? Can you explain that?
To me, it actually brought scale at good cost, and that's why I've been a big proponent of Vertica ever since I started using it.
There's a number, like you said, of different platforms where you can load big data and store and house big data, but the purpose of having that big data is not just for it to sit there, but for it to be used, and used in a variety of ways. And that's from something smaller, like the first installation I was on, which was about 10 terabytes, and I've worked with data warehouses up to 100 terabytes, and there are Vertica installations with hundreds of petabytes on them. You want to be able to use that data. So you need a platform that's going to be able to access that data and get it to the clients, get it to the customers as quickly as possible, without paying an arm and a leg for the privilege to do so. And Vertica allows companies to do that, not only get their data to clients and in-company users quickly, but save money while doing so.
So why couldn't I just use a traditional RDBMS? Why not just throw it all into Oracle?
One, cost. Oracle is very expensive, where Vertica is a lot more affordable than that. But the column store structure of Vertica allows for a lot more optimized queries. Some of the queries that you can run in Vertica in two, three, four seconds will take minutes and sometimes hours in an RDBMS like Oracle or SQL Server. They have the capability to store that amount of data, no question, but the usability really lacks when you start querying tables that are 180 billion rows, or tables, like some in Vertica, that are over a thousand columns. Those will take hours to run on a traditional RDBMS, and running them in Vertica, I get my queries back in seconds.
You know what's interesting to me, Joe, and I wonder if you could comment. It seems that Vertica has done a good job of embracing, you know, riding the waves.
Whether it was HDFS in the, you know, early part of the big data era, or machine learning and machine intelligence, whether it's, you know, TensorFlow and other data science tools, and the cloud is the other one, right? A lot of times cloud is super disruptive, particularly to companies that started on-prem. It seems like Vertica somehow has been able to adopt and embrace some of these trends. First of all, from your standpoint as a customer, is that true, and why do you think that is? Is it architectural? Is it sort of mindset, engineering? I wonder if you could comment on that.
It's absolutely true. I started out, again, on an on-prem Vertica data warehouse and we kind of, you know, rolled along with them. More and more people have been using data. They want to make it accessible to people on the web now and, you know, having the option to provide that data from an on-prem solution or from AWS is key. And now Vertica is offering even a hybrid solution, if you want to keep some of your data behind a firewall on-prem and put some in the cloud as well. So Vertica has absolutely evolved along with the industry in ways that no other company really has that I've seen. And I think the reason for it, and the reason I've stayed with Vertica and specifically have remained a Vertica DBA for the last seven years, is because of the way Vertica stays in touch with its customers. I've been working with the same people for the seven, eight years I've been using Vertica. They're a family. I'm part of their family. And, you know, I'm good friends with some of these people and they really are in tune, not only with the customer, but with what they're doing. They really sit down with you and have those conversations about, you know, what are your needs? How can we make Vertica better? And they listen to their clients.
You know, just having access to the data engineers who develop Vertica, arranged over a phone call or whatnot, I've never had that with any other company. Vertica makes that available to their customers when they need it. So the personal touch is huge for them.
That's good. It's always good to get the confirmation from the practitioners and not just hear from the vendor. I want to ask you about the Eon transition. You mentioned that MassMutual brought you in to help with that. What were some of the challenges that you faced, and how did you get over them? And why Eon, you know, what was the goal, the outcome, and some of the challenges maybe that you had to overcome?
Right. So MassMutual had an interesting setup when I first came in. They had three different Vertica clusters to accommodate three different portions of their business. There was one for the data scientists, who use the data quite extensively in very large queries, very intense queries, you know, working with their predictive analytics and whatnot. There was a separate one for the API, which needed, you know, sub-second query response times, and with the Enterprise solution, they weren't always able to get the performance they needed because the fast queries were being overrun by the larger queries that needed more resources. And then they had a third for starting to develop this enterprise data platform and start, you know, looking into their future. The first challenge was bringing all those three together and back into a single cluster, and allowing our users to have both the heavy queries and the API queries running at the same time on the same platform without having to completely separate them out onto different clusters.
Eon really helps with that because it allows you to store that data in the S3 communal storage, have the main cluster set up to run the heavy queries, and then you can set up sub-clusters that still point to that S3 data but separate out the compute, so that the APIs really have their own resources to run and are not interfered with by the other processes.
Okay, so I'm hearing a couple of things. One is you're sort of busting down data silos. So you're able to have a much more coherent view of your data, which I would imagine is critical. Certainly companies like MassMutual have been around for a hundred years and so you've got all kinds of data dispersed. So to the extent that you can break down those silos, that's important. But also being able to, I guess, have granular increments of compute and storage is what I'm hearing. What does that do for you? It makes it more efficient. Are there other business benefits? Maybe you could elucidate.
Well, one, cost is, again, a huge benefit. The cost of running three different clusters, even in AWS, in the Enterprise solution was a little costly. You had to have your dedicated servers here and there. So you're paying for like 12, 15 different servers, for example, whereas when we bring them all back into Eon, I can run everything on a six-node production cluster, and when things are busy, I can spin up a three-node sub-cluster for the APIs, only pay for it when I need it, and then bring them back into the main cluster when things have slowed down a bit, and they can get that performance that they need. So it saves a ton on resource costs. You're not paying for the storage, you're paying for one S3 bucket. You're only paying for the nodes, the EC2 instances, that are up and running when you need them, and that is huge. And again, like you said, it gives us the ability to silo our data without having to completely separate our data into different storage areas, which is a big benefit.
It gives us the ability to query everything from one single cluster without having to synchronize it to three different ones. So this one can have theirs, this one can have theirs, but everyone's still looking at the same data, and we replicate that in QA and Dev so that people can do it outside of production and do some testing as well.
So Eon is obviously a very important innovation and of course, Vertica touts the difference between others who separate compute from storage, and they're not the only one that does that, but they are really, I think, the only one that does it for on-prem and virtually across clouds. So my question is, and I think you're doing a breakout session on the virtual BDC, we were going to be in Boston, now you're doing it online. If I'm in the audience, I'm imagining I'm a junior DBA at an organization that maybe doesn't have a Joe. I haven't been an expert for seven years. How hard is it for me, what do I need to do to get up to speed on Eon? It sounds great, I want it. I'm going to save my company money, but I'm nervous because I've only been a Vertica DBA for a year and I'm sort of not as experienced as you. What are the things that I should be thinking about? Do I need to hire somebody? Do I need to bring in a consultant? Can I learn it myself? What would you advise?
It's definitely easy enough that if you have at least a little bit of Vertica experience, you can learn it yourself, okay? The concepts are still there. There are some little nuances where you do need to be aware of certain changes between the Enterprise and the Eon edition, but I would also say consult with your Vertica account manager. Let them bring in the right people from Vertica to help you get up to speed. And if you need to, there are also resources available as far as consultants go that will help you get up to speed very quickly.
And we did work together with Vertica and with one of their partners, Clarity, in helping us to understand Eon better and set it up the right way. How do we pick the number of shards for our data warehouse? They helped us evaluate all that and pick the right number of shards, the right number of nodes to get set up and going. And they helped us figure out the best ways to get our data over from the Enterprise edition to Eon very quickly and very efficiently. So, get yourself.
I wanted to ask you about organizational issues because guys like you, practitioners, always tell me, look, the technology comes and goes, that's kind of the easy part. We're good at that. It's the people, it's the process, it's the skillsets. What does your team regime look like? And do you have any sort of ideal team makeup or ideal advice? Is it two-pizza teams? What kind of skills? What kind of interaction and communications to senior leadership? I wonder if you could just give us some color on that.
One of the things that makes me extremely proud to be working for MassMutual right now is that they do what a lot of companies have not been doing, and that is invest in IT. They have put a lot of thought, a lot of money and a lot of support into setting up their enterprise data platform and putting Vertica at the center. And not only did they put the money into getting the software that they needed, like Vertica, MicroStrategy and all the other tools that we're using, they put the money into the people. Our managers are extremely supportive of us. We hired about 40 to 45 different people within a four-month timeframe: data engineers, data analysts, data modelers, a nice mix of people who can help shape your data and bring the data in and help the users use the data properly, and allow me as the database administrator to make sure that they're doing what they're doing most efficiently and focus on my job.
So you have to have that diversity among the different data skills in order to make your team successful.
That's awesome. Kind of a side question, and it's really not Vertica's wheelhouse, but I'm curious. In the early days of the big data movement, a lot of the data scientists would complain, and they still do, that "80% of my time is spent wrangling data." The tools for the data engineer, the data scientist, the database experts, they're all different. Is that changing, and to what degree is that changing? Kind of what inning are we in in terms of a more facile environment for all those roles?
Again, I think it depends company to company, on what resources they make available to the data scientists. And the data scientists, we have a lot of them at MassMutual, and they're very much into doing a lot of the machine learning, model training, predictive analytics, and they are used to doing it outside of Vertica too, pulling that data out into Python and Scala, Spark, tools like that. And they're also now just getting into using Vertica's in-database analytics and machine learning, which is a skill that definitely nobody else out there has. So having somebody who understands Vertica like myself and being able to, pardon me, being able to train other people to use Vertica the way that is most efficient for them is key. But also just having people who understand not only the tools that you're using, but how to model data, how to architect your tables, your schemas, the interaction between your tables and schemas and whatnot. You need to have that diversity in order to make this work. And our data scientists have benefited immensely from the structure that MassMutual put in place for our data management delivery team.
That's great. I think I saw somewhere in your background that you've trained about 100 people in Vertica. Did I get that right?
Yes, since I started here, I've gone to our Boston location, our Springfield location and our New York City location and trained probably, at this point, about 120, 140 of our Vertica users. And I'm trying to do a couple of follow-up sessions per year.
So adoption obviously is a big goal of yours, getting people to adopt the platform, but then more importantly, I guess, delivering business value and outcomes.
Absolutely.
I wanted to ask you about encryption. In the perfect world, everything would be encrypted, but there are trade-offs. Are you using encryption? What are you doing in that regard?
We are actually just getting into that now due to the New York and the CCPA regulations that are now in place. We do have a lot of personally identifiable information in our data store that does require encryption. So we are going through a months-long process that started in December, I think it was actually a bit earlier than that, to start identifying all the columns, not only in our Vertica database, but in the other databases that we do use. We have a Postgres database, SQL Server, and Teradata for the time being until that moves into Vertica. And we identify where that data sits, what downstream applications pull that data from the data sources and store it locally as well, and start encrypting that data. And because of the tight relationship between Voltage and Vertica, we settled on Voltage as the major platform to start doing that encryption. So we're going to be implementing that in Vertica probably within the next month or two and roll it out to all the teams that have data that requires encryption. We're going to start rolling it out to the downstream application owners to make sure that they are encrypting the data as it gets pulled over. And we're also using another product for several other applications that don't mesh as well with Voltage.
Voltage being Micro Focus's encryption solution, correct?
Correct, yes.
Yeah, of course, Micro Focus, for the audience, owns Vertica, even though Vertica is a separate brand. So I want to kind of close on what success looks like. You've been at this for a number of years, and coming into MassMutual, it's great to hear. I've had some past experience with MassMutual, it's an awesome company. I've been to the Springfield facility and in Boston as well, and I have great respect for them. And they've really always been a leader. So it's great to hear that they're investing in technology as a differentiator. What does success look like for you? Let's say you're at MassMutual for a few years, you're looking back, what's success look like, Joe?
Good question. It's changing every day, just with more and more applications coming on board, more and more data being pulled in, more uses being found for the data that we have. I think success for me is making sure that Vertica, first of all, is always running at its most optimal to keep our users happy. I think when I started, we had a lot of processes that were running six, seven hours, and some of them were taking almost a day because they were so complicated. We've got those running in under an hour now, some of them running in a matter of minutes. I want to keep that optimization going for all of our processes. Like I said, there's a lot of users using this data and it's been hard, over the first year of me being here, to get to all of them. And thankfully, I'm getting a bit of help now. I have a couple of assistant DBAs that I'm training up to help out with these optimizations, fixing queries, fixing projections to make sure that queries do run as quickly as possible. So getting that to its optimal stage is one. Two, getting our data encrypted and protected so that even if, for whatever reason, somehow somebody breaks into our data, they're not going to be able to get anything at all because our data is 100% protected. And I think more companies need to be focusing on that as well.
And third, I want to see our data science teams using more and more of Vertica's in-database predictive analytics and in-database machine learning products, and really helping make their jobs more efficient by doing so.
Joe, you're an awesome guest. I mean, we always, like I said, love having the practitioners on and getting the straight skinny and potent. You're welcome back anytime. As I say, I wish we could have met in Boston, maybe next year at the BDC, but it's great to have you online. And thanks for coming on theCUBE.
And thank you for having me, and hopefully we'll meet next year.
I hope so. And thank you everybody for watching. Now remember, theCUBE is running concurrent with the Vertica virtual BDC. It's vertica.com slash BDC 2020 if you want to check out all the keynotes and all the breakout sessions. I'm Dave Vellante from theCUBE. We'll be doing more interviews, so keep it right there. Thanks for watching.