Live from New York, it's theCUBE. Covering theCUBE, New York City, 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. Hello everyone, welcome back to theCUBE, live in New York City for CUBE NYC, in conjunction with the Strata conference, Strata Data, formerly Strata Hadoop. It's our ninth year covering the big data ecosystem, which has evolved into machine learning, AI, data science, cloud. A lot of great things happening, all things data, impacting all businesses. I'm John Furrier, your host, with Peter Burris, who's filling in for Dave Vellante. Our next guest is Stephanie McReynolds, the CMO, VP of Marketing for Alation. Thanks for joining us. Thanks for having me. Good to see you. So you guys have a pretty spectacular exhibit here in New York. I want to get to that right away. The top story is Attack of the Bots, and you're showing a great demo. Explain what you guys are doing at the show. Yeah, well, it's robot fighting time in our booth. So we've brought a little fun to the show floor. My kids are- Big data is not fun enough? Well, big data is pretty fun, but occasionally you've got to get your geek battle on. So we're having fun with robots, but I think the real story in the Alation booth is about the product and how machine learning data catalogs are helping a whole variety of users in the organization: everything from improving analyst productivity, and even business users' productivity with data, to supporting data scientists in their work by helping them distribute their data products through a data catalog. You guys are one of the new-guard companies doing things that make it really easy for people who want to use data: practitioners, the average data citizen, as they've been called, people who want productivity. Not necessarily the hardcore folks setting up clusters. Really the big data user. What does that market look like right now? Has it met your expectations? How's business?
What's the update? Yeah, I mean, I think we have a strong perspective that for us to close the final mile and get to real value out of the data, it's a human challenge, right? There's a trust gap with managers. Today on stage over at Strata, it was interesting because Google had a speaker, and it wasn't their chief data officer. It was their chief decision scientist. And I think that reflects what that final mile is: it's making decisions. And it's the trust gap that managers have with data, because they don't know how the insights are coming to them. What are all the details underneath? In order to be able to trust decisions, you have to understand who processed the data, what decision-making criteria they used, whether this data was governed well, whether we're introducing bias into our algorithms, and whether that can be controlled. And so Alation becomes a platform for getting answers to those questions. There are plenty of other companies optimizing the performance of those queries and the storage of that data, but we're trying to close that trust gap. Well, it's very interesting, because from a management standpoint, we're trying to do more evidence-based management. So there's a major trend in boardrooms and executive offices to find ways to acculturate the executive team to using data: evidence-based management, which started in health care and is now being applied to a lot of other domains. We've also historically had this situation where the people who worked with the data were a relatively small coterie of individuals who created these crazy systems to try to bring those two together. I really like the idea of the data scientist being able to create data products that can then be distributed. It sounds like you're trying to look at data as an asset to be created and distributed, so it can be more easily used by more people in the organization. Have I got that right? Absolutely.
So we're now seeing, we're at just over a hundred production implementations of Alation at large enterprises, and we're now seeing those production implementations get into the thousands of users. So this is going beyond those data specialists, beyond the unicorn data scientists who understand systems and math and technology. What we're seeing is- And the business. And business, right, and business. So what we're seeing now is that a data catalog can be a point of collaboration across those different audiences in an enterprise. Whereas three years ago, some of our initial customers kept their data catalog implementations small, right? They were giving the specialists access to this catalog and asking them to certify data assets for others. What we're starting to see is a proliferation of self-service data assets being created, and a certification process that is now enterprise-wide, with thousands of users in these organizations. So eBay has over a thousand weekly logins. Munich Re's head of data engineering was on stage yesterday; they have 2,000 users on Alation at this point on their data lake. Pfizer is going to speak on Thursday, and they're getting up to those numbers as well. So we see some really solid organizations, organizations solving medical and pharmaceutical issues, the largest reinsurer in the world, leading tech companies, starting to adopt a data catalog as a foundation for how they're going to make data-driven decisions in the organization. Tell me about how the product works, because essentially you're bringing together the decision scientist, for lack of a better word, and the productivity worker, almost like a business office concept as a SaaS. So you've got a SaaS model that says, hey, you want to play with data? Use it, but you've got to do some front-end work. Take us through how you guys roll out the platform and how your customers consume the service. I mean, take us through the engagement with customers.
Yeah, I mean, I think for customers, the most interesting part of the product is that it presents itself as an application that anyone can use, right? So there's a super familiar search interface that, rather than bringing back web pages, allows you to search for data assets in your organization. If you want more information on a data asset, you click on those search results and you can see all of the information about how that data has been used in the organization, as well as the technical details and the technical metadata. And I think what's even more powerful is that we actually have a recommendation engine that recommends data assets to the end user, and that can be plugged into Tableau and Salesforce Einstein Analytics and a whole variety of other data science tools, like Dataiku, that you might be using in your organization. So this looks like a very easy-to-use application that folks are familiar with, that you just need a web browser to access. But on the back end, the hard work that's happening is the automation we do with the platform. So we go out and crawl these source systems, looking not just at the technical descriptions of data, the metadata that exists, but also being able to understand, by parsing the SQL query logs, how that data is actually being used in the organization. We call it Behavior I/O: by looking at the behaviors of how that data is being used, from those logs, we can actually give you a really good sense of how that data should be used in the future, or where you might have gaps in governing that data, or how you might want to reorient your storage or compute infrastructure to support the type of analytics actually being executed by real humans in your organization. And that's eye-opening to a lot of IT. You're deriving insights from the data usage so that the business can optimize, whether it's the IT footprint or the kinds of use cases. Is that kind of how it's working?
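To make the "parsing the SQL query logs" idea concrete, here is a minimal, hypothetical sketch of usage mining from a query log. It is not Alation's implementation: a production catalog would use a full SQL parser, while this stand-in regex only catches simple FROM/JOIN clauses, and all names in the example log are invented.

```python
import re
from collections import Counter

# Illustrative only: pull table names out of logged queries and count
# how often each one is read. Real systems parse the full SQL grammar.
TABLE_RE = re.compile(r"\b(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)

def table_usage(query_log):
    """Return a Counter of table names referenced across logged queries."""
    usage = Counter()
    for query in query_log:
        usage.update(t.lower() for t in TABLE_RE.findall(query))
    return usage

log = [
    "SELECT * FROM sales.orders o JOIN sales.customers c ON o.cid = c.id",
    "SELECT region, SUM(amount) FROM sales.orders GROUP BY region",
]
print(table_usage(log).most_common(1))  # -> [('sales.orders', 2)]
```

Even this toy version shows the shape of the insight: which assets real humans actually query, and therefore where governance attention or storage and compute spend should go.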
So what's interesting is the optimization actually happens in a pretty automated way, because we can make recommendations to those consumers of data about how they want to navigate the system. Kind of like Google makes recommendations as you browse the web, right? Yeah. You misspell something. Did you mean this kind of thing? Did you mean this? Might you also be interested in this, right? Others like you, it's kind of a cross between Google and Amazon. Others like you may have used these other data assets in the past to determine revenue for that particular region. Have you thought about using this filter? Have you thought about using this join? Did you know that the analysis you're trying to do has maybe already been done by the sales ops guy, and here's a certified report? Why don't you just start with that? We're seeing a lot of reuse in organizations. Where in the past, I think as an industry, when Tableau and Qlik and all these very self-service-oriented BI tools started to take off, it was all about democratizing visualization by letting every user do their own thing. Now we're realizing that to get speed and accuracy and efficiency and effectiveness, maybe there's more reuse of the work we've already done in existing data assets, and by recommending those and expanding the data literacy around interpreting them, you might actually close this trust gap with the data. But there's one really, really important point that you raised, and I want to come back to it. That is this notion of bias. So you know, Alation knows something about the data and knows a lot about the metadata. So it therefore, I don't want to say understands, but it's capable of categorizing data in that way.
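The "others like you used these data assets" recommendation described above can be sketched as simple co-occurrence counting over users' query histories. This is an assumed, generic collaborative-filtering toy, not Alation's actual recommendation engine, and the asset names are made up:

```python
from collections import Counter

# Hypothetical sketch: recommend data assets that co-occur with the one
# a user is viewing, across other analysts' usage histories.
def recommend(histories, current_asset, top_n=2):
    """histories: list of per-user sets of asset names they have queried."""
    co_use = Counter()
    for used in histories:
        if current_asset in used:
            co_use.update(a for a in used if a != current_asset)
    return [asset for asset, _ in co_use.most_common(top_n)]

histories = [
    {"revenue_by_region", "fx_rates", "certified_sales_report"},
    {"revenue_by_region", "certified_sales_report"},
    {"fx_rates", "hr_headcount"},
]
print(recommend(histories, "revenue_by_region"))
# -> ['certified_sales_report', 'fx_rates']
```

The design point is the one from the conversation: surfacing an already-certified report ahead of raw tables steers users toward reuse rather than redoing work.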
And you're also able to look at the usage of that data, by parsing some of the SQL statements, and then make a determination of whether or not the data, as it's identified, is being used appropriately, based on how people are actually applying it, so that you can identify potential bias or potential misuse, whatever else it might be. That is an incredibly important thing. You know, as you know, John, we had an event last night, and one of the things that popped up is: how do you deal with emergence in data science, in AI, et cetera, and what methods do you put in place to ensure that the governance model can be extended to understand how those things are potentially, in a very soft way, corrupting the use of the data? So could you spend a little more time talking about that? Because it's something a lot of people are interested in, and quite frankly, we don't know about a lot of tools doing that kind of work right now. That's an important point. I think the traditional viewpoint was that if we can just manage the data, we will be able to have a governance system. So if we control the inputs, then we'll have a safe environment. That was kind of the classic single-source-of-truth data warehouse model. Stewards of the data. What we're seeing is that with the proliferation of sources of data, and how quickly data is getting created with IoT and new modern sources, you're not able to manage data at that entry point. And it's not just about systems. It's about individuals who go on the web and find a data set and then load it into a corporate database, or merge an Excel file with something that's in a database. And so I think what we see happening, not only when you look at bias, but if you look at some of the new regulations like GDPR, is that the logic that you're using to process that data, the algorithm itself, can be biased.
If you have a biased training data set and you feed it into a machine learning algorithm, the algorithm itself is going to be biased. And so the control point, in this world where data is proliferating and we're not sure we can control it entirely, becomes the logic embedded in the algorithm, even if that's a simple SQL statement feeding a report. And so Alation is able to introspect that SQL and highlight that maybe there is bias at work in how this algorithm is composed. So with GDPR, the consumer owns their own data. If they want it pulled out of a training data set, you've got to rerun that algorithm without that consumer's data. And that becomes your control point going forward for the organization on the different governance issues that pop up. So, on the psychology of the user base: one of the things that's shifted in the data world is that a few stewards of data used to manage everything. Now you've got a model where literally thousands of people in an organization could be users. Productivity users. So you guys have a social component in here, so that people know who's doing data work, which in a way creates a new persona or class of worker. A non-techy worker. I mean, it's interesting: if you think about moving access to data, and moving the individuals who are creating algorithms, out to a broader user group, what's important? You have to make sure that you're educating and training and sharing knowledge with that democratized audience, right? And to be able to do that, you kind of want to work with human psychology, right? You want to give people guidance in the course of their work, rather than have them memorize a set of rules and try to remember to apply them. If you had a specialist group, you could kind of control them and force them to memorize and then apply. But the more modern approach is to say, look, with some of these machine learning techniques that we have, why don't we make a recommendation?
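The GDPR point above, rerunning the algorithm without a consumer's data once they withdraw consent, can be illustrated in a few lines. This is a deliberately minimal sketch under assumed names: the "model" is just a mean over labeled rows so the retraining step stays self-contained, which is obviously far simpler than a real pipeline.

```python
# Hypothetical sketch of a right-to-erasure workflow: drop one
# consumer's records from the training set, then refit on the rest.
def fit_mean(rows):
    """Toy 'model': the mean of the numeric values in (consumer, value) rows."""
    values = [v for _, v in rows]
    return sum(values) / len(values)

def erase_and_refit(rows, consumer_id):
    """Remove one consumer's records, then retrain on what remains."""
    remaining = [(cid, v) for cid, v in rows if cid != consumer_id]
    return fit_mean(remaining), remaining

training = [("alice", 10.0), ("bob", 20.0), ("carol", 30.0)]
model, remaining = erase_and_refit(training, "bob")
print(model)  # -> 20.0, the mean of alice's and carol's values
```

The governance value is in the pattern, not the math: every record stays attributable to a consumer, so erasure and retraining are mechanical rather than forensic.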
Hey, what you're about to do is introduce bias into that calculation. And we're capturing that information as you use the data. We're also making a recommendation to say, hey, do you know you're doing this? Maybe you don't want to do that. Most people who are using the data are not bad actors. They just can't remember all the rule sets to apply. And so what we're trying to do is catch someone behaviorally in the act, before they make that mistake, and say, hey, just a bit of a reminder, a bit of a coaching moment. Do you know what you're doing? Maybe you could think of another approach to this. And we've found in many organizations that that changes the discussion around data governance. It's no longer this top-down constraint on finding insights, which frustrates an audience that's trying to use that data. It's more like a coach helping you improve. And then the social aspect of wanting to contribute to the system comes into play, and people start communicating, collaborating in the platform, and curating information a little bit. I remember when Microsoft Excel came out, the spreadsheet, or Lotus 1-2-3. Oh my God, people are going to do these amazing things with spreadsheets, and they did. You're taking a similar approach with analytics, a much bigger surface area of work to attack from a data perspective, but in a way the same kind of concept: hey, you know, put the data in the hands of the users, so to speak. Yeah, enable everyone to make data-driven decisions, but make sure they're interpreting that data in the right way, right? Give them enough guidance. Don't let them just kind of attack the wild west. Well, looking back at the Microsoft Excel spreadsheet example, I remember when a finance department would send a formatted spreadsheet, with all the rules for how you used it, out to 50 different groups around the world, and everybody figured out that they could go in and manipulate the macros. And deliver any results that they want.
And so it's that same notion, you have to know something about that. But this, I mean, in many respects, Stephanie, you're describing a data governance model that really is more truly governance. That if we think about a data asset, it's: how do we mediate a lot of different claims against that set of data, so that it's used appropriately, so it's not corrupted, so that it doesn't affect other people? But very importantly, so the outcomes are easier to agree upon, because there's some trust, there are some valid behaviors, and there's some verification in the flow of the data utilization. And where we give voice to a number of different constituencies, because business opinions from different departments can run slightly counter to one another. There can be friction in how to use particular data assets in the business, depending on the lens that you have in that business. And so what we're trying to do is surface those different perspectives, give them voice, allow those constituencies to work that out in a platform that captures that debate, captures that knowledge, and makes that debate a foundation of knowledge to build upon. So in many ways, it's kind of like the scientific method, right? As a scientist, I publish a paper. Get peer reviewed. Then get peer reviewed, let other people weigh in. It becomes part of the canon of knowledge. And it becomes part of the canon. And in the scientific community over the last several years, you see that folks are publishing their data sets publicly. Why can't an enterprise do the same thing internally for different business groups? Take the same approach. Allow others to weigh in. It gets to better insights, and it gets to more trust in that foundation. And you get collective intelligence from the user base to help come in and make the data smarter and sharper. Yeah, and you have reusable assets that you can then build upon to find the higher-level insights.
Don't run the same report that 100 people in the organization have already run. Stephanie, final question for you. As you guys are emerging, you've started doing very well. I guess you have a unique approach. Obviously, we think it fits in with the new guard of analytics and the productivity worker with data, which we think is going to be a huge persona. Where are you guys winning, and why are you winning? With your customer base, what are some of the things that are resonating as you go in and engage with prospects and customers and existing customers? What are they attracted to? What do they like? And why are you beating the competition in your sales opportunities? I mean, I think this concept of a more agile, grassroots approach to data governance is a breath of fresh air for anyone who's spent their career in the data space. We're at a turning point in the industry where you're now seeing chief decision scientists, chief data officers, chief analytics officers take a leadership role in organizations. Munich Re is using their data team to actually invest in whole new arms of their business. That's how they're pushing the envelope on leadership in the insurance space, and we're seeing that across our install base. Alation becomes this knowledge repository for all of those minds in the organization, and it encourages a community to be built around data and insightful questions of data. And in that way, the whole organization rises to the next level. And I think it's that vision of what can be created internally, how we can move away from just claiming that we're a big data organization and really start to see how new business models can be created from these data assets, that's exciting to our customer base. Well, congratulations, hot startup. Alation, here on theCUBE in New York City for CUBE NYC, changing the game on analytics, bringing a breath of fresh air into the hands of the users. A new persona developing. Congratulations. Great to have you.
Stephanie McReynolds, it's theCUBE. Stay with us for more live coverage, day one of two days live in New York City. We'll be right back.