Live from New York, it's theCUBE covering Big Data NYC 2015. Brought to you by Hortonworks, IBM, EMC, and Pivotal. Now your hosts, Dave Vellante and George Gilbert. New York City everybody, this is theCUBE and we're here at Big Data NYC as part of Strata + Hadoop World. We're at Pillars 37 just down the street from the Javits Center. Anthony Deighton is here, he's the Vice President of Products at Qlik, the data visualization company. Hot company, growing. Anthony, welcome to theCUBE. It's great to see you. Pleasure, great to be here. So we've not met before, I think this is your first time on theCUBE. That is true. Qlik maybe we've had on before, but give us the update, share with our audience who you guys are. Where's the growth coming from? The space? Sure, so as you said in the introduction, Qlik is a software company, we're focused in this data visualization space. Public company, on the NASDAQ under the ticker symbol QLIK, about half a billion dollars in revenue, growing approximately 25% on an annual basis. We're really transforming the BI industry in terms of how we allow users to get value out of the data that they're collecting and managing. So one of the clear trends we see in the market, and I think it's something you absolutely see here at Strata, is that organizations are collecting more and more data. The problem we're focused on is, how do you make that data valuable from a user perspective? So there's a strong and robust set of technologies and growth around managing that data. Then the question becomes, how do we get that in front of users so they can make better decisions with it? And ultimately, where we're really focused is on how people make decisions with data, how they communicate those decisions, how they share those insights, and that's really the focus of the company. Well, and it seems that, from a business user standpoint, there's a major push on doing things faster. I don't want to wait for somebody to build a cube.
It's too slow, too cumbersome, too complicated. Just give me some data and I can iterate. Talk about that dynamic and what it means to your business and what's happening in your customer base. So if you go back even five or 10 years, the state of the art for how people made decisions with data was, they asked IT for a data dump, right? There was this really long-winded, painful process where, as a business person, I would make a request for data, which I would then receive maybe as an extract file or something, and then I'd have to figure out how to munge that data and make a decision. Or I could go back to a reporting organization and the analysts and have them build reports for me that tried to answer the question. And typically that cycle time was measured in months. So we always used to joke that by the time you got back the answer, you'd forgotten what the question was, right? So you'd get back this answer and think, what was I asking about? And inevitably, right, you'd get back an answer and the market had shifted, or your question had changed, or more typically you'd get that answer and realize you had another question. Oh, I see what's going on with sales in the Northwest region. I wonder if I could break that down by zip code. And that began yet another month-long process of going back and forth. And so where Qlik is really focused is, how can we reduce that cycle time? How can we make the cycle time between question and answer maybe as short as a click, right? Maybe just let the user do it themselves. And that's driven the growth of this company. Citizen analyst, George. Okay, yes. But you're describing a problem that goes beyond just the visualization, because the way traditional data pipelines were made was they kind of refined just enough data to answer just the questions that were in the data warehouse.
So if you came back after working with a visualization tool and said, I need some more information, you had to unravel that transformation pipeline and go back and get more data. So there's got to be more secret sauce to visualization than just, I can quickly find out what I want to know. I mean, it's a brilliant question because it's exactly right. The problem isn't a data visualization problem only. Data visualization is an important part of how you make it engaging and attractive from an end user perspective. But there is a whole set of work behind that to make that data available. And in fact, in some sense, that was the problem five and 10 years ago. The problem five and 10 years ago is that you needed this long string of work from the raw data, to transformation of the data, staging the data, placing it in cubes, finally turning it into the report and delivering it to the user. That process needed to exist because computers were slow and memory was expensive. The core technology innovation of Qlik was to recognize that memory today would be plentiful and very inexpensive. And if that were true from a platform perspective, then we could build on top of it a really powerful engine, which we refer to as our associative engine, one that created a unique visual experience from the user perspective, tied to a set of governance and manageability that made the whole organization trust the data that's coming back. That could actually be a new platform, a new architecture for solving this problem. But that sounds like you started with the experience you wanted, the visualization, but you worked backwards through a pipeline, one you're not really being explicit about so far, that knows how to go get source data when needed. Exactly, or let me frame that a little differently. Oftentimes people ask, is the secret sauce of Qlik in its visualization or in its powerful engine? Or both.
Exactly, and so my answer to that is always, it's the union of these two capabilities. It's the two together. Chocolate and peanut butter, the two great tastes that taste great together. I hate Reese's. But that associative engine, well, it doesn't comprise the entirety of the secret sauce, but it's different from what you hear from others. So traditionally the model has been this sort of drill-down metaphor. The idea is you start with the universe of data and you start drilling down to the answer. And I liken this to trying to find an answer in the data by knowing where the answer is and finding a path to get to it. So if I want to find the needle in the haystack, I'd better know where the needle is so I can find my way to it. And what you realize very quickly is that's not how we operate as human beings. As humans, we are naturally associative. You meet somebody, you can't remember their name, you try to place their face from the last time you met them, you remember the context of where you were, you remember the other people they were with, and then maybe you get to the name. Or maybe you take a completely different path to that data. But the point is that your brain works associatively; as human beings we're naturally wired to take nonlinear routes through data. So the real core idea of Qlik was, if we could build an engine that worked that way, then we could have a differentiated user experience, i.e. one that works the way we think. The problem was that in a traditional model, with cubes and with this drill-down metaphor, that simply didn't work. So we needed to invent a new platform technology to have that be true. Just to clarify that, it sounds like the user experience for interacting with the data is probably a traditional cube, because that's familiar.
But there's got to be another model separate from that that actually ties all the data in all the different cubes and all the details behind that together, so that when you want to drill associatively, whatever the word you choose. We usually use the word click. Crazy as that may sound. When you want to click your way to the related data, there's a separate pipeline that does that. How does that get built? So in a way, you don't have to build that at all. The engine simply works that way, and it's expressed through the user interface. So maybe a slightly better way to explain this is, the corporate colors for Qlik are green, white, and gray. And that is not because we picked three random colors that we liked the look of, although they happen to look nice together. It's that they relate back to a metaphor in the product. So green items are selected, white items are what we call included, and gray items are excluded. And the idea here is that as a user, you click on things, you make them green to highlight them. And we will always show you all of the data. It's much easier if I talk about an example. So you're looking at the sales for your organization and you click on North America. We're going to show you the customers in North America, the products that they bought, and the invoices that are outstanding, to make up some examples. But we're also going to show you the customers who did not buy in North America, the products they didn't buy, these are the gray values, the invoices which are not open. And of course, that gives you great insight. Maybe you say, I thought we had sold the ABC product in North America, but I see instantly it's gray. We didn't sell that. And let me give you a very practical example. The NHS in the UK was looking at how doctors dealt with cataract surgeries. And there is a regimen of drugs that you give a patient when they have cataract surgery.
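As a concrete illustration of that green/white/gray metaphor, here is a minimal sketch in Python. It is purely illustrative, with made-up table and field names, and is not Qlik's actual engine: a selection on one field partitions every other field's values into selected (green), possible (white), and excluded (gray).

```python
# Illustrative sketch only -- not Qlik's implementation.
# Rows live in memory; selecting a value on one field partitions every
# field's values into selected (green), possible (white), excluded (gray).

SALES = [
    {"region": "North America", "product": "ABC"},
    {"region": "North America", "product": "XYZ"},
    {"region": "Europe",        "product": "XYZ"},
    {"region": "Europe",        "product": "DEF"},
]

def select(rows, field, value):
    """Apply one selection and classify every field's values."""
    matching = [r for r in rows if r[field] == value]
    states = {}
    for f in rows[0]:
        all_values = {r[f] for r in rows}
        possible = {r[f] for r in matching}
        selected = {value} if f == field else set()
        states[f] = {
            "green": selected,                 # what you clicked
            "white": possible - selected,      # still associated
            "gray":  all_values - possible,    # excluded, but still shown
        }
    return states

states = select(SALES, "region", "North America")
print(sorted(states["product"]["white"]))  # ['ABC', 'XYZ']
print(sorted(states["product"]["gray"]))   # ['DEF']
```

The point of the sketch is the last line: a filter-based query would simply drop DEF from the result, while the associative model keeps it visible as a gray value, which is exactly the "I see instantly it's gray, we didn't sell that" insight.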
And they were looking for variances in that regimen across the doctors in the NHS, because there are very different outcomes and very different costs for cataract surgery, as an example. And they could not only see the drugs that doctors were prescribing, they'd also see the drugs that a doctor was not prescribing. And it turns out that in this particular case, the great insight came from the not. It came from the fact that there was something that the doctors were prescribing in one place, but were not prescribing in these high-cost delivery places. Anthony, who do you sell to? So it's a great question. I'd say there is a more general trend in information technology, and in particular in the data and BI space, which is that increasingly the business user is the buyer of this kind of technology, and decreasingly is it seen as a central IT purchase. And again, this maps back to where we started the conversation, which is that in the past, the way you answered questions with data was to make a request to IT. And so it made sense that IT would be the buyer of this kind of technology. Increasingly, users want to do it themselves, and so increasingly they're demanding the purchase of this capability for themselves. Okay, and so how do I consume your product? Is it on my mobile? Is it on my laptop? Is it in the cloud? Great question. And the answer is yes. Okay. I always like to say, if you have a kid and you ask them whether they like chocolate or vanilla ice cream, the correct answer is yes. You get both. Okay, so you don't care. Well, it's actually an important strategy from our perspective. So we see a lot of vendors making mobile versions of their product, and we think this is dead wrong, because what we see users doing is traversing different form factors. So you may begin your work on a PC. Then you may transition into a scenario where you're using, for example, an iPad.
And then you may be on the road and you want to quickly check something on your iPhone or your Android phone. You may want to have infrastructure that you're running in your own data center. You may want to use cloud infrastructure. Both of those need to be okay. So our product automatically works across all of these different device types. So taking the example, the visualizations will actually refactor themselves based on the screen size and capabilities of the device that you're using to look at them. We think this is a critical capability for the next generation. Going back to the differentiation and secret sauce, with that sort of invisible pipeline underneath that relates all the information, without going into specific competitors, there's another visualization company that, as far as I know, sort of pulls the information for the visualization through the equivalent of a straw. And so you can interact with the data, but you can't get back at where it came from. Is that a fair characterization? Yeah, so I might say it slightly differently, which is that in general, with most visualization tools we see, underneath the visualization is a layer of SQL. Yes. So they need to express the visualization in the context of a SQL query. I need to be able to take this bar chart, to make up an example, and express it as a SQL query. And the challenge with expressing these things as a SQL query is that it brings you right back to this drill-down and filter metaphor. The nature of a SQL query is, select this, group by that, and filter to these pieces of the data. So it's a filter and group-by metaphor, and you immediately lose details as you get into the data. In addition, what our experience shows is that SQL as an interface is typically quite slow. There are certainly technologies we've seen that improve the performance of SQL, but we need performance that's truly split-second, and our own usability testing bears that out.
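That "select this, group by that" shape can be made concrete with a small, hypothetical example; the table and data are invented for illustration, and SQLite stands in for whatever engine sits under a SQL-backed chart:

```python
import sqlite3

# Hypothetical sales table behind a SQL-backed bar chart (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("North America", "ABC", 100.0),
     ("North America", "XYZ", 250.0),
     ("Europe",        "XYZ", 300.0),
     ("Europe",        "DEF", 150.0)],
)

# The chart is expressed as: filter (WHERE), group (GROUP BY), aggregate (SUM).
rows = conn.execute(
    "SELECT product, SUM(amount) FROM sales "
    "WHERE region = 'North America' "
    "GROUP BY product ORDER BY product"
).fetchall()
print(rows)  # [('ABC', 100.0), ('XYZ', 250.0)]
```

Notice that DEF is simply absent from the result: the query has no way to report what was excluded, which is exactly the detail the associative model keeps visible as gray.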
If you force a user to wait even more than a few seconds, two, three, they stop clicking. And that's death for us. We want people to click. So the speed experience isn't just moving around the visualization, it's getting back to the underlying data. So this idea of aggregate to details is important, but it's even sort of surfing the data. Again, surfing being, finding the associated data. Exactly. And answering that unanticipated question, because this is really the core of the whole idea, which is that users don't know what questions they want to ask of the data when they go in. They may have a beginning hypothesis. I think the problem is in this region. But that step into the data prompts other questions. Oh, I see what's going on in this region, but what's going on with that product line? Is it true across other regions? Is there a particular salesperson that's having a problem? Is that salesperson representative of this region of salespeople? So you don't begin and end your analysis through this linear drill-down metaphor. You're actually, excuse my words, surfing the data. You're taking a nonlinear route. Drilling around. Yeah, and this is very analogous to how we surf the internet. You know, we don't come in in the morning and say, today I will visit these four websites in this order, starting with the World Wide Web. No, you begin, you look at something that prompts an idea or a question, you go somewhere else, you follow a link, and that metaphor we want to have true for data analysis. The other difference I hear in your messaging is the governance piece. But it's interesting, in that there's the pendulum swinging away from centralized IT to the distributed world, but the end user doesn't typically individually care as much about governance. Isn't that the IT organization's concern? Or maybe there's another compliance organization. Square that circle for me. No, that's exactly right.
And so I think traditionally people have seen governance as the opposite of data visualization, right? It's sort of like the principal in the school, right? But I actually have seen almost the exact opposite. Users want to be able to trust the data that they're using for decision making. And that implies a role for governance and, for lack of a better term, certification of data. This is my trusted source for this piece of data. Where governance breaks down is when it becomes an end in itself, when I say to you, you can only have this data. Because there's data that users don't trust, but it's the best data that they have, right? So I might have a spreadsheet on my desktop, and nobody should trust that, right? It's a spreadsheet, you can enter whatever you want. But if it's the best piece of data for that business problem, marrying that up with the governed data that IT has, that's a really powerful solution. So we see the governance problem really as a continuum, from truly untrusted sources all the way to really trusted sources, and of course there's gray space in between. And we want to provide a framework for all of that data, right? So if it's not trusted, that's okay, that's yours. If it is trusted, that's great, that came from IT. And so IT can be happy, the user can be happy, and then they really create, I like to think of it often as a common language for talking about data. Anthony, what's your relationship with the Hadoop ecosystem? You know, the Hadoop vendors. There's all this data, and you're helping analyze all this data. Where do you fit? Yeah, it's a great question. So they're great partners of ours. We have a great partnership with Cloudera, we're recently certified on Impala, et cetera. More generally though, there's a philosophical alignment which I think is really critical. The great value of Hadoop is this concept of schema on read.
So this idea that I as a data analyst don't have to define my schema until I'm looking at the data. Well, it turns out that maps perfectly to Qlik's engine. So in a way you could think of Qlik's engine as a schema creation engine, which can then go draw data from, in this case, a Hadoop infrastructure, apply a schema for the purpose of answering a problem, and present it to the user in a really high-performance way. That's a beautiful marriage. Rather than having to define a schema, for example, in a data warehouse, then draw the data in, fix it, and then analyze it, we can have a much more flexible model. It really is a shared vision. No schema on write; okay, time for a schema, let's bring Qlik into the equation. Or to say it differently, it's schema on analyze. When you're ready to analyze the data, then let's apply the schema relevant for that analysis. And a great example of that is, we have a customer in King.com. You probably know them as Candy Crush. Oh yeah. Right, so these are the guys that make Candy Crush. And they have a massive Hadoop infrastructure for collecting all the data that they get off these games. They make other games besides Candy Crush, although that's the most popular. And as you can imagine, it's a huge amount of data. Nobody's going to be analyzing that directly. What they do is they use Qlik on top of that, and they pull off important business questions. So perhaps they've launched a new game, they've added a new type of candy, and they want to be able to see, is that new type of candy affecting people's success rate with the game? The whole name of the game with these gaming companies is you want to keep the user successful, but not too successful, so you prompt them to pay for the currency to make themselves more successful.
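Schema on read, as he describes it, means the raw records carry no fixed schema until analysis time. Here is a rough sketch in Python, with invented event fields rather than King.com's actual data:

```python
import json

# Raw event log as it might land in a data lake: no schema enforced on write.
# (Field names are invented for illustration.)
raw_events = [
    '{"game": "candy", "level": 12, "candy_type": "striped", "won": true}',
    '{"game": "candy", "level": 12, "candy_type": "plain",   "won": false}',
    '{"game": "candy", "level": 13, "candy_type": "striped", "won": true}',
]

def apply_schema(lines, fields):
    """Schema on read: parse each record at query time and project
    only the fields this particular analysis needs."""
    for line in lines:
        record = json.loads(line)
        yield {f: record.get(f) for f in fields}

# Today's question: does the new candy type affect the success rate?
rows = list(apply_schema(raw_events, ["candy_type", "won"]))
striped = [r["won"] for r in rows if r["candy_type"] == "striped"]
print(sum(striped) / len(striped))  # 1.0
```

Tomorrow's question can project different fields from the same raw log; nothing was baked into a warehouse schema on write, which is the flexibility the anecdote turns on.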
And that kind of business problem, almost by definition, changes with every iteration of the game, with every new feature, with every new game, and they can't fix the schema of that data into a warehouse. They need to have that flexibility: in this case, a Hadoop infrastructure, and then an analytic technology like Qlik applied on top of it. How do you guys think about your TAM? You're a big company, most people may not realize, I didn't, you're over half a billion, probably on a run rate closer to 600 million, and you can see how you could get to a billion over time. How do you think about the TAM, the size of this business, the way to expand your TAM? Do you have to start building databases? Could you talk about that a little bit? Sure, so look, we think that this is a very big market. And the more important point is that it's largely unaddressed. I think it's a very challenging thing, in particular in this forum and at an event like this, because we're dealing with the rare cream at the top, people who are really capable and facile with data and technology. In a way, you go three blocks that way, and you're going to be looking at companies that have just barely started down this road, for whom state of the art is 10-year-old technology. So this is a big market; if the market's this table, we've penetrated just the corner of it. You know, the other thing I would say is, traditionally the data analytics space has been focused, as we talked about, on IT, and users are considered end users, which I think is a pejorative, right? These are the dumb end users. I'm going to give you the reports, right? Because you're not smart enough to do it yourself. If we can open this technology up so that smart people can start answering questions themselves, then really everybody should have this technology at their fingertips, right?
There isn't a person who wouldn't be relevant for this market. Again, a simple example here. Lush Cosmetics, a customer of ours, you've probably seen their stores around Manhattan. These guys use Qlik, and not only at headquarters, but down at the warehouse, and even at the store. And they're using it to give stores visibility into their relative performance, across stores, and even within products. And they can do things like see that a product performs well when it's placed on a shelf next to another product, so they can change the layout of the store, or look at their inventory levels to see what's turning and what's not, and then communicate that back to the store. And so it's a great example of, are those traditional report users? Probably not. But this is an opportunity to expand that market and address new kinds of users and use cases. This sounds like it's a little bit further down the maturity curve. Of course. Are some of the Hadoop vendors bringing you in to say, look, we've sold this company on the concept of the data lake, or ETL offload, or the adjunct data warehouse, help us show value to the customer by making the data consumable? Exactly. Is that the mainstream right now? Precisely. I always like to joke that you don't want the demo of your new Hadoop infrastructure to be a command prompt. Right. That's a boring demo. Right, here you go, we're done. SQL and slash. Along with four DBAs to manage six nodes. Exactly. Or more likely it wouldn't be SQL, it would be whatever the Bourne shell is. But anyway, the point is, you don't want that to be your first demo. What you want is a compelling solution to a business problem. So you want to walk into the customer, you want to speak their language, and you want to show value in the context of their business problem. So if you're walking up to an automobile manufacturer, you want to talk about cars and inventory and channel, et cetera. You don't want to talk command prompt.
And so you're precisely right. The relationship here is a new generation, a new mechanism of managing infrastructure for data, married with a new generation of how we make that relevant and accessible for users. Anthony, we have to leave it there, but it's a really interesting story. Qlik, really appreciate you coming onto theCUBE and sharing that story. Hot company, we really expect interesting things going forward. Congratulations on all the success, and we'll be watching. My pleasure. All right, keep it right there, everybody. We'll be back with our next guest. This is theCUBE. We're live from Big Data NYC in New York City. We'll be right back.