Live from San Jose, California, it's The Cube, covering Big Data Silicon Valley 2017. Okay, welcome back everyone. We're live in Silicon Valley for the Big Data SV Cube coverage. I'm John Furrier, with Wikibon analyst George Gilbert. We have Bruno Aziza, who's the CMO of AtScale and a Cube alumnus, and Josh Klahr, VP of Product at AtScale. Welcome to The Cube. Welcome back. Bruno, great to see you. You look great, you're smiling as always. Business is good. Give us the update on AtScale. What's up since we last saw you in New York? Well, thanks for having us, first of all. And yeah, business is great. I think last time I was here on The Cube, we talked about the Hadoop maturity survey. And at the time, we had just launched the company. And so now you look about a year out, and we've grown about 10X. We have large enterprises across just about any vertical you can think of: financial services, American Express; healthcare, think about Aetna, Cigna, GSK; retail, Home Depot, Macy's, and so forth. And we've also done a lot of work with our partner ecosystem. So Hortonworks OEMs AtScale technology, which is a great way for us to get to large scale across the US, but also internationally. And then our customers are getting recognized for the work that they're doing with AtScale. So last year, for instance, Yellow Pages got recognized by Cloudera with a leadership award, and Macy's got a leadership award as well. So things are going on the right trajectory. And I think we're also benefiting from the fact that the industry is changing. It's maturing on the big data side, but there's also a redefinition of what business intelligence means: this idea that you can have analytics on large scale data without having to change your visualization tools, and make that work for the existing stack you have in place. And I think that's been helping us along. How did you guys do it?
I mean, you know, we've talked many times and there's some secret sauce there, but at the time when you guys were first starting, it was kind of a crowded field, right? And all these BI tools are out there. You had front end BI tools, but everyone was still separate from the whole batch back end. So what did you guys do to break out? So there's two key differentiators with AtScale. The first one is we are the only platform that does not have a visualization tool. And so people think about this as a bug. It's actually a feature, because most enterprises have already made bets with traditional BI tools. And so our ability to talk to MDX- and SQL-type BI tools without any changes is a big differentiator. And then the other piece of our technology is this idea that you can get the speed, the scale and the security on large data sets without having to move the data. It's a big differentiation for enterprises to get value out of the data they already have in Hadoop, as well as non-Hadoop systems, as we call them. Josh, you're the VP of products, you have the roadmap. Just give us a peek into what's happening with the current product. And where are the work areas? Where are you guys going? What's the to-do list? Where's the check box? And what's the innovation coming around the corner? Yeah, I think, I mean, to follow up on what Bruno said about how we hit the sweet spot, I think we made a strategic choice, which is we don't want to be in the business of trying to beat Tableau or Excel or be a better front end. And there's so much diversity on the back end. If you look at the ecosystem right now, whether it's Spark SQL or Hive or Presto or even new cloud based systems, the sweet spot is really, how do you fit into those ecosystems and support the right level of BI on top of those applications? So what we're looking at from a roadmap perspective is how do we expand and support the back end data platforms that customers are asking about?
I think we saw a big white space in BI on Hadoop in particular, and I'd say we've really nailed it over the past year and a half, but we see customers now, they're asking us about Google BigQuery, they're asking us about Athena. I think these serverless data platforms are really, really compelling. They're going to take a while to get adoption, so that's a big investment area for us. And then in terms of supporting BI front ends, we're kind of doubling down on making sure our Tableau integration is great. Power BI is, I think, getting really big traction. Well, two great products. You've got Microsoft and Tableau, leaders in that area. Yeah, the self-service BI revolution has, I would say, won, and the business user wants their tool of choice. Where we come in is the folks responsible for data platforms on the back end. They want some level of control and consistency, and so they're trying to figure out where do you draw the line? Where do you provide standards? Where do you provide governance? And where do you let the business loose? All right, so Bruno and Josh, I want you guys to answer the question, it'll be a good quiz. So define next generation BI platforms from a functional standpoint and then under the hood. Yeah, well, there's a few things you can look at. I think if you were at the Gartner BI conference last week, you saw that there's 24 vendors in the Magic Quadrant. And I think in general, people are now realizing that this is a space that's extremely crowded, and it's also sitting on technology that was built 20 years ago. Now, when you talk to enterprises like the ones that we work with, the ones I named earlier, you realize that they all have multiple BI tools. So the visualization war, if you will, kind of has been set up and almost won by Microsoft and Tableau at this point. And the average enterprise has 15 different BI tools.
So clearly, if you're trying to innovate on the visualization side, I would say you're going to have a very hard time. So you're dealing with that level of complexity. And then from the back end standpoint, you're now having to deal with databases from the past, that's the Teradatas of this world, data sources from today, Hadoop, and data sources from the future, like Google BigQuery. And so I think the CIO's answer to "what is the next gen BI platform I want" is something that enables me to simplify this very complex world. I have lots of BI tools, lots of data. How can I standardize in the middle in order to provide security, provide scale, provide speed to my business users? And that's really going to radically change the space, I think. If you're trying to sell a full stack that's integrated from the bottom all the way to visualization, I don't think that's what enterprises want anymore. Josh, under the hood, what are the next generation key levers for the tech and just the enabler? Yeah, so for me, the end state for the next generation BI platform is a user can log in, they can point to their data, wherever that data is: it's on-prem, it's in the cloud, it's in a relational database, it's a flat file. They can design their business model, and we spend a lot of time making sure we can support the creation of business models. What are the key metrics? What are the hierarchies? What are the measures? It may sound like I'm talking about OLAP. You know, that's what our history is steeped in. Oh, faster data's coming, I mean that's streaming and batch coming together. So I should be able to just point at those data sets and turn around and be able to analyze them immediately. On the back end, that means we need to have pretty robust modeling capabilities so that you can define those complex metrics, so you can functionally do what are traditional business analytics: period over period comparisons, rolling averages, navigating up and down business hierarchies.
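Editor's illustration, not part of the interview: the "period over period comparisons" and "rolling averages" mentioned above can be sketched in a few lines of plain Python. This is only a minimal sketch under stated assumptions. The revenue figures, the function names `period_over_period` and `rolling_average`, and the three-month window are all hypothetical, and nothing here represents how AtScale computes these metrics.

```python
from statistics import mean

# Hypothetical monthly revenue figures, invented purely for illustration.
monthly_revenue = [
    ("2017-01", 100.0),
    ("2017-02", 110.0),
    ("2017-03", 121.0),
    ("2017-04", 90.0),
]


def period_over_period(values):
    """Fractional change versus the previous period (None for the first period)."""
    changes = [None]
    for prev, curr in zip(values, values[1:]):
        changes.append((curr - prev) / prev)
    return changes


def rolling_average(values, window):
    """Trailing average over `window` periods (None until enough periods exist)."""
    averages = []
    for i in range(len(values)):
        if i + 1 < window:
            averages.append(None)
        else:
            averages.append(mean(values[i + 1 - window : i + 1]))
    return averages


values = [value for _, value in monthly_revenue]
pop = period_over_period(values)
avg3 = rolling_average(values, 3)

# Print one row per month: month, revenue, change vs. prior month, 3-month average.
for (month, value), change, avg in zip(monthly_revenue, pop, avg3):
    print(month, value, change, avg)
```

In a BI platform these calculations would typically be declared once in the business model and then requested through whatever front end the user prefers, rather than hand-coded per report.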
The optimization should be built in. It shouldn't be the responsibility of the designer to figure out, do I need to create indices? Do I need to create aggregates? Do I need to create summarization? That should all be handled for you automatically. You shouldn't have to think about data movement. And so that's really what we've built in from an AtScale perspective on the back end. Point to data, we're smart about creating optimal data structures so you get fast performance, and then you should be able to connect whatever BI tool you want. You should be able to connect Excel. We can talk the MDX query language. We can talk SQL. We can talk DAX, whatever language you want to talk. So take the syntax out of the hands of the user, so they're not getting in the weeds on that stuff. Make it easier for them to get to the... And the key word, I think, for the future of BI is open. We've been buying tools over the last 20 years. What do you mean by that? Explain. Open means that you can choose whatever BI tool you want and you can choose whatever data you want. As a business user, there's no real compromise. But because you're getting an open platform, it doesn't mean that you have to trade off complexity. I think some of the stuff that Josh was talking about, period over period analysis, the type of multi-dimensional analysis that you need, calendar analysis, historical data, that's still going to be needed, but you're going to need to provide this in a world where the business user and the IT organization expect that the tools they buy are going to be open to the rest of the ecosystem. And that's new, I think. George, you want to get a question in edgewise? Come on. You know, I've been sort of a single issue candidate, I guess, this week on machine learning and how it's sort of touching all the different sectors. And I'm wondering, how do you see yourselves as part of a broader pipeline, you know, of different users adding different types of value to data?
Yeah, I think maybe on the machine learning topic, there's a few different ways to look at it. The first is, we do use machine learning in our own product. I talked about this concept of auto-optimization. One of the things that AtScale does is it looks at end user query patterns, and we look at those query patterns and try to figure out how can we be smart about anticipating the next thing they're going to ask, so we can pre-index or pre-materialize that data. So there's kind of machine learning in the context of making AtScale a better product. Reusing things that are already done, that's been the whole story of the machine learning demos we saw at Google Next with the video editing and the video recognition stuff. That's been a huge part of it. You've got users giving you signals, take that information and be smart with it. I think in terms of the customer workflow, Comcast, for example, a customer of ours, we are in a discovery phase. There's a data science group that looks at all of their set-top box data and they're trying to discover programming patterns. Who uses the Yankees network, for example. And where they use AtScale is what I would call a descriptive element, where they're trying to figure out what are the key measures and trends and what are the attributes that contribute to that. And then they'll go in and they'll use machine learning tools on top of that same data set to come up with predictive algorithms. So just to be clear, they're hypothesizing about, let's say, either the pattern of users that might have an affinity for a certain channel or channels, or they're looking for pathways. Yes, yeah. And I'd say our role in that right now is a descriptive role. We're supporting the descriptive element of that analytics life cycle. I think over time, our customers are going to push us to build in more of our own capabilities when it comes to, okay, I discovered something descriptive.
Can you come up with a model that helps me predict it the next time around? Honestly, right now, people want BI. People want very traditional BI on the next generation data platform. Just continuing on that theme, leaving machine learning aside, I guess, as I understand it, when we talked about the old school vendors like Teradata, when they wanted to support data scientists, they grafted on some machine learning, a parallel version of R, in the core Teradata engine. They also bought Aster Data, which was for a different audience. So I guess my question is, will we see from you ultimately a separate product line to support a new class of users, or are you thinking about new functionality that gets integrated into the core product? I think it's more the latter. So the way that we view it, and this is really looking at, like I said, what people are asking for today, which is kind of the basic traditional BI: what we're building is essentially a business model. So when someone uses AtScale, they're designing and they're telling us, they're asserting, these are the things that I'm interested in measuring and these are the attributes that I think might contribute to it. And so that puts us in a pretty good position to start using, whether it's Spark on the back end or built-in machine learning algorithms on the Hadoop cluster, our knowledge of that business model to help make predictions on behalf of the customer. So just a follow-up, and then this really sort of leaves out the machine learning part, which is, it sounds like with big data, it was first an archive that supported more data retention than we could do affordably with the data warehouse. Then we did the ETL offload, and now we're doing more and more of the visualization, the ad hoc stuff. Exactly right. So in a couple of years' time, what remains in the classic data warehouse and what's in the Hadoop category?
Well, I think what you're describing is the pure evolution of any technology, where you start with the infrastructure. We've been in this for over 10 years. Now you've got Cloudera going IPO and then going into the Data Science Workbench. That's not official yet. I think we read about this, at least they filed. But I think the direction is showing: now people are relying on the Hadoop platform in order to build applications on top of it. And so I think, just like Josh is saying, the mainstream application on top of the database, and I think this is true for non-Hadoop systems as well, is always going to be analytics. Of course, data science is something that provides a lot of value, but it typically provides a lot of value to a small set of people who will then scale it out to the rest of the organization. I think if you now project out to what this means for the CIO and their environment, I don't think any of these platforms, Teradata or Hadoop or Google or Amazon or any of those, 100% replaces the others. And I think that's where it becomes interesting, because you're now having to deal with a heterogeneous environment where the business user is at. They're using Excel, they're using an extended application, they might be using the result of machine learning models, but they're also having to deal with a heterogeneous environment at the data level. Hadoop on-prem, Hadoop in the cloud, non-Hadoop in the cloud, and non-Hadoop on-prem. And of course, that's the market that I think is very interesting for us as a simplification platform for that world. I think you guys are really thinking about it in a new way and I think that's kind of a great modern approach. Let the freedom, and by the way, a quick question on the Microsoft tool and Tableau, what percentage share do you think they are of the market? 50? You mentioned those were the two top ones.
Yeah, I mentioned them because if you look at the Magic Quadrant, clearly, Microsoft with Power BI and Tableau have really kind of shot up all the way to the right. Because it's easy to use, basically, and it's easy to work with data. I think so. I think from a functionality standpoint, you see Tableau's done a very good job on the visualization side. I think from a business standpoint, a business model execution, and I can talk from my days at Microsoft, it's a great distribution model to get thousands and thousands of users to use Power BI. Now, the guys that weren't talked about in the last Magic Quadrant are people like Google Data Studio or Amazon QuickSight, and I think that will change the ecosystem as well, which again is great news for us. More muscle coming in. For you guys, a rising tide floats all boats. That's right. So you guys are powering it. That's right. With the modern BI, it'd be safe to say. That's the idea. The idea is that the visualization is basically commoditized at this point, and what business users want, and what enterprise leaders want, is the ability to provide freedom and openness to their business users, and never have to compromise security, speed, and also the complexity of those models, which is what we are in the business of. Get people working, get people more productive faster. In whatever tool they want. All right Bruno, thanks so much. Thanks for coming on. AtScale, modern BI, here on theCUBE, breaking it down. This is theCUBE covering Big Data SV and Strata Hadoop, back with more coverage after this short break.