Live from the San Jose Convention Center, extracting the signal from the noise, it's theCUBE, covering Hadoop Summit 2015, brought to you by headline sponsor Hortonworks, and by EMC, Pivotal, IBM, Pentaho, Teradata, Syncsort, and by Attunity and Cisco. Now your hosts, John Furrier and George Gilbert. Hello, everyone, welcome to theCUBE. This is SiliconANGLE Media's flagship program. We go out to the events and extract the signal from the noise. We are here live in Silicon Valley at the San Jose Convention Center for Hadoop Summit 2015. We've broadcast every single Hadoop Summit since its inception, when Hortonworks spun out of Yahoo, and it's been a great industry conference. We're going to be here for three days of wall-to-wall coverage with theCUBE. I'm John Furrier, the founder of SiliconANGLE. I'm your host, with George Gilbert, our big data analyst at Wikibon, and our guest analyst today, CEO, founder, former practitioner at Bank of America, Abhi Mehta. Guys, welcome to the kickoff segment for Hadoop Summit. Again, live in Silicon Valley on the edge, the big data space, we've got Spark Summit coming up next week in San Francisco. So much going on in the industry right now. Hadoop Summit is where all the action is. We'll be covering it for you, but guys, I've got to ask your perspective. I mean, the analytics business. Hadoop has created a huge innovation cycle, but are we stalled? Are we in transition? The word transformation is up on stage. Rob Bearden, the CEO of Hortonworks, talking about transformation. Enterprise is ready. It's almost like rah, rah, you know, kumbaya, Hadoop. We made it, we're making it. It certainly has made it, but a lot's changed. Are we in a shifting of the tides, of the winds? Abhi, what's your take? You're on the ground floor, you're at a company you're running, self-funded Tresata, doing very well. Thank you. No outside capital. No outside capital. 
As you're talking to customers, you're getting paid for your services, so you are getting paid for the value, you're creating value. What's it like on the front lines right now? With Hadoop and the big data ecosystem, are we in transition? Is it evolving? Is it growing? Is it shifting? What's your analysis? The sense we get, John, first of all, it's always good to be back. It's good to join a really cool crew, looking sharp as always. Thanks for dropping the tie. I get a hard time from my friends. Whenever you wear a tie, I get a hard time. I never wear a tie. The analysts wear the tie. I'm sure they can tell you. But what a seminal moment. We have known each other for almost six years now. And the first time we spoke Hadoop, there was one company trying to commercialize it. The second company, which has now since gone public, I think this is the first CUBE at Hadoop Summit since Hortonworks has gone public. And what a huge sense of achievement for them, for the ecosystem. Many kudos to Rob and his team to have taken a very, very interesting player all the way to the ultimate culmination of innovation as a new enterprise going IPO. We are definitely, I would say we've hit the wall. I would not say that we have hit the peak of innovation. I think we're shifting eras. We're shifting eras, going from the building blocks of data analytics to what I call the new era of automated intelligence. It's a less scary explanation of the term AI that a lot of people have since spoken about. So you're definitely seeing the early starts just like we predicted six years ago. We're seeing the start of the next industrial revolution, data being the raw material. I think this is the next era in that same revolution where massive amounts of automation will enable every single industry, every single company, and every single player in the ecosystem to automate hitherto complex, manual, and human processes. That is the ultimate pinnacle of this revolution. 
Well, George is new to the team, SiliconANGLE Media, Wikibon, and also Big Data, but he wrote a really amazing paper around systems of intelligence, something that we've been kicking around. You put a box around it. IBM calls it cognitive computing, automating intelligence. This is the future. George, I mean the vendors here, I mean we've always been saying consolidation, consolidation, consolidation, but still growth. Not necessarily negative growth, consolidation, yard sale, it's evolution. So I want to get your take on one, systems of intelligence, what you mean by that. And then your outlook for this industry right now because it seems to me there's a wind blowing, there's a smell in the air of money and growth. Yet the vendors are jockeying for position as the NASCARs and the cars on the track. Who's got the drafting, who's leading, slingshotting back and forth. We seem to see players pimping themselves up to be bought, some are positioning for IPOs, some are trying to figure out their future. So talk about systems of intelligence, that vision, and where the players are, then we can discuss who's winning. Okay, it's kind of a perfect segue for me to paint a broad picture of systems of intelligence, and Abhi's a great guest because he can tell us about what he's built, which fits right into that. Systems of intelligence, the way we see them, are a natural outgrowth of the original systems of record like ERP systems. Those became systems of engagement when you had a consumer-quality UI that sort of gave the user an immersive experience. The systems of intelligence come in when you can bring to bear the analytics in real time to anticipate and influence what the consumer might do. And you can also spin it around backwards and look at the internet of things and respond in real time. 
But where we are in the platforms to support that, platforms always come together out of a fragmented set of pieces where one or a couple of vendors will bring those together in an integrated hub. And right now, we look at, let's say the three major distro vendors, MapR, Cloudera, Hortonworks, and they all talk about having this data hub, but if you open the covers a little bit, it's really not a product, it's an ecosystem. And the way they're building out these ecosystems, all these Apache products, they're dumping essentially a lot of complexity on the administrators and on the developers. And that's retarding the growth of this. And it's companies like Abhi's that are sort of curating the pieces. And to some extent actually opting for alternative platforms. Well, let's talk about that because Abhi's company, Tresata, okay? I was just watching our segment from last year while driving in, and we talked specifically about how enterprise software has changed. How it's developed, delivered, and sold. Completely radically changed. You addressed that. Go to last year's video, Abhi Mehta, to get a real deep dive on that. But that is changing the application business. So one of the challenges that we see here at Hadoop Summit and through this past year throughout our many CUBE interviews is that the application developers, AKA the workload, is in charge. So as an entrepreneur, you have investment in building software, enterprise software. When you walk in the front door, while you're cutting the line, if you will, right to the ivory tower, you've got the big banks and retail because of your software, you've got to deal with a lot of legacy stuff to integrate. So do you write that software? Does the customer write the software? That's friction in my mind. So comment on that. And are we stalled waiting for cloud to catch up? 
So I mean, I love the vision of, that's automated intelligence, but if you can't get the data and you've got to build connectors to some legacy systems, whether it's systems management, some storage. Now you've got Flash and Spark. I mean, there's so much software that needs to be written. You can almost spend the year writing software just to get up and running. So what's your take on all that? Excellent point. Let's start with the vision. The vision at Tresata has been quite simple. In fact, I was reminded of it talking to the CEO of a bank here in the US a few weeks ago. And he said, why is it that when I search for a restaurant or a movie to go to, I can power up this very simple interface, type a query exactly how I would speak the language, and get an answer. But when I have to go seek an answer to, how much in deposits do I have in a particular market? What were my customer service ratings from my customers for a particular region? I can't get an answer. It seems like an act of God almost for large enterprises to seek clarity from the mountains of data. Why can't I solve that problem? That in a nutshell explains the vision of Tresata. There are three underlying components that need to be enabled, created, curated, or put together, depending on which part of the ecosystem you're in, for actually making that vision happen, which is transforming enterprises into intelligence engines where anybody in the enterprise, anybody, the CEO, the person watching the gate when someone walks in, knowing that this person could be a bad actor or not, that person in risk or fraud, can get answers to their questions in a very easy interface. The first building block for that is what you call cloud. And at this point, I've stopped calling it cloud because I think, as I was saying before we started the segment, there's probably a greater chance at this point to convince humanity that there is one God than there is to convince them there'll be one cloud. 
But infrastructure as a service is what I refer to it as, to enable applications that automate complex human processes. The question you asked me yesterday: is analytics a process or a product? I told you my answer. It is the automation of complex human processes. That's what analytics' ultimate vision and ambition should be. It is very hard, to your point, John, to deploy those intelligence engines, automated intelligence engines, in large enterprises when the infrastructure isn't ready for it. It is not just friction. It is a cost, a tremendous cost to speed to market. You cannot deploy analytics and intelligence today or seek to deploy it today, wait for 12, 24 months for it to get deployed, because by that time, the consumer's already gone. So that's the first building block. You have to be able to, in an open-ended way, dock your application seamlessly into multiple and viable environments. Both private infrastructure as a service and public infrastructure as a service. And it has to happen. So two questions for you guys. One came up from Courtney Anders who works for platformer yesterday on the crowd chat we had with Cisco at Cisco Live Analytics. And she said, how real is the concept of citizen data scientist? Is anyone seeing this trend? It's kind of a softball. So hold on for that for a second. Then the other one was, when do we drop the word big from big data? So thoughts on that. I mean- You should never drop the words big data from- I always said- Never drop the word big from big data. Big money, big data. It represents something so fundamentally different, John. Sorry to interrupt you, but big data for me since day one represents two major changes. A Darwinian technology evolution that is fundamentally transforming every single piece of the enterprise technology stack. From storage all the way to BI. 
And secondly, there is a voluminous trend around data that allows us to understand the citizen data scientist themselves much better and arm the citizen data scientist to take action very quickly. Which you cannot represent simply with the word data or BI or any new word we may come up with. I think it has become sexy and fashionable to bash the word big data, but 6,000 people wouldn't be here but for that word, which fundamentally represents a Darwinian shift in the technology ecosystem we grew up in. So the cloud, you mentioned that's going to be there. This brings up the internet of things conversation. You saw Rob Bearden with Hortonworks. He looked like John Chambers up there. He was delivering really confidently. He had a spring in his step, really connecting with the audience. I thought Rob Bearden's keynote here this morning was fantastic. You know, you're seeing the internet of things. He brings that up with sensors. Is the internet of things important with Spark and other technologies around the corner? How hyped up is the internet of things? I mean, first of all, you know my feeling. You've got the iWatch, mine's on its way here, but you are seeing human aspects of the internet of things. Humans are things, machines are things. This is a really big hot area. Yes. Thoughts? At Tresata, I personally, and you as well, have never chased the buzzword. You asked me a question last month when I was here on your show and you said, what do you think about machine learning? And I said, another buzzword that everybody's chasing. Fundamentally, we all agree and we have to take it back to business value. There is tremendous business value waiting to be unearthed, unleashed, and discovered from the internet of things, people, devices, robots, and intergalactic equipment. Whatever you want to call them. It doesn't matter what label we put towards it. 
The question fundamentally is the question we ask ourselves at Tresata, and other participants in the ecosystem ask it too. Cloudera and Hortonworks together have done something quite amazing, which we have never seen before in our evolution or my lifetime of technology. Not only have both of them become the fastest growing enterprise software companies in history, they have shortened the timeframe to lay down the foundations, the groundwork of a wave of innovation that still is yet to be unleashed, right? The Darwinian moment, the Galapagos has been built, but the reproduction hasn't begun yet. Why? The question becomes why? And the answer is quite simple. If you go back to business value, delivering business value, irrespective of IoT, IOP, data lake, data oceans, you know, lake, a term I never liked, I liked data oceans a lot better, irrespective of the buzzwords, the terminology and the tooling, what I go back to is, you look at Red Hat, and Red Hat after 15 years is worth $10 billion of market cap. Whereas, you know, a decacorn in today's terminology, you know, what I call the centaur, the $100 billion companies were the IBM and Oracle. IBM and Oracle were concerned and Red Hat emerged as a trap. What we all have forgotten is that the market value that IBM and Oracle garnered because of Red Hat's implementation of a vastly superior and distributed infrastructure dispersion model enabled them to create hundreds of billions in value. Hortonworks and Cloudera, I'm sorry, Hortonworks and Cloudera have already become billion dollar companies. They absolutely deserve the valuations. We have three minutes left. The question is, where are the $100 billion companies in this ecosystem? I know you want to say that. And that's what we come up with. So I want to get a couple more points and I want George to chime in on this. $100 billion companies, you mentioned Oracle, IBM, the big whales, I mean, Azure's doing great. So what are the stars? 
We were just commenting about Databricks and these other companies that are out there where, you know, it's hard for these guys to survive when Amazon's going to just replicate their features. So two things, guys, how does the startup survive when the winds are shifting? That's an opportunity as well. And two, there's a huge M&A boom going on right now. Oracle's going to be a player. IBM's already a player. These guys are going to be writing checks. VMware is going to be running like the wind. Guys, startups, how do they compete? If not, consolidation certainly is going to be on the horizon. Thoughts, George? Go for it, George. Where we talk about the Hadoop distro players as platforms and we talk about reaching sort of mainstream acceptance, crossing the chasm, if we remember, it requires that bowling alley success, that one key economic application, which we think is data warehouse offload. But we also recognize that that platform isn't fully mature and alternatives like Databricks are sort of appealing to customers who say, I want a much simpler developer and administrator experience that also happens to have greater productivity. Where do you see that in terms of affecting sort of the trajectory we're on with the platforms? John, I first have to say I'm quite disappointed. There are only three minutes left and I remember John Cleese getting equally upset. Yeah, get your water there. I've got my water over here. Too much to say in three minutes. I'm kidding, I'm kidding. That was one of the best ever interviews on theCUBE. Mine included, but what a good job you did with John. Excellent. That was so humble of you, yours included. Humility is overrated in the world, you know? Competition, only the best survive. So startups, M&A. Yes. Yeah, let's get into that. Here's my perspective. 
I'm a huge fan of any innovative ecosystem being developed from the ground up, especially when it goes after enabling the delivery, the deployment, and the complete re-engineering. Actually, in some cases, rebuilding of insight in large enterprises. Databricks fits that mold. Huge fan of Databricks. And what they've done in Spark in months is what we saw happen with the Hadoop and HDFS ecosystem in years, which is just the pace of innovation. To answer John's question, I've said this for a long time and I'll say it again. If you're building a company, if you're building a startup today in these tiers, BI, what was called BI, analytics, databases, ETL, and storage, don't. You cannot, they're free. They're tools and you cannot build companies on tools. You have to rise above the traditional analytics stack. Cloudera and Hortonworks. That was true in the last generation for the most part. It was true. It was incredibly true for the last generation because think of it. What we're doing in this current revolution is automating the last remaining vestige of global GDP. Agriculture got automated, manufacturing got automated. This is the era to automate complex human processes, a.k.a. services, right? So the ability to go and sell tools is over. The ultimate BI platform is called a white page with a rectangular bar in the middle. You don't need more complex reporting or BI. So if you're building a company as a startup in those tiers, you're done, you're dead. Don't build a company in those tiers. So the advice you get is you've got to be able to find immense business value around human processes, everything from fraud to, you know, the internet of things to server management, transforming the way products are built, delivered and sold and priced in a way with technology and automation never seen before. And that is what we call advanced analytical applications. And we've done a very good job with it. So where are the hermaphrodites? 
So to your point about value, your company goes in, land and expand, you win on the merits of your product. You mentioned Databricks and the platforms of all the companies out there where they have a great beachhead in a narrow segment scope. And their goal as a startup is to sequence to a broader position, a platform, if you will. So if you're a company and you're out there, you're saying to yourself, hmm, I'm a feature right now. So you're always at risk unless you're highly differentiated. Absolutely. Amazon, Oracle, these guys just build that out over time. That's how it goes. So the question is how do companies in this market use the benefit of open source to get into the market fast, create some value, and land and then sequence to a platform and or a broader differentiated position? Excellent question. Every time we build something at Tresata, we ask ourselves the question, what unique advantage will we have if we engineer a certain intelligent feature in the product? And if the answer is that within less than six months there's a chance for it to either be open sourced or to sustain itself in a richer way in open source, we will not build it. I'll give you a great example. SQL on Hadoop, right? Leaving aside the fact that I think it's mating a rat with an elephant and the results will not be pretty, it has become the lingua franca of Hadoop as well. If you want to query data in Hadoop, not do analytics, you've got to use SQL on Hadoop. So when you start thinking around feature versus functionality versus automating intelligence, three pivots of building companies around it, if you're in the first two, you have a very, very low chance of survival. So if you're building a company for security on Hadoop, for SQL on Hadoop, for running open source algorithms in Hadoop, you will be consumed by open source. You will not build a viable business model around it. 
However, if you're building an intelligence engine to solve anti-money laundering in Hadoop, if you're building an intelligence engine to go automate omnichannel marketing in Hadoop, you've now delivered, day zero when you go live into an enterprise, tremendous business value to unearth intelligence and convert insight for customers. There is no other way. It's not that complex. You don't have to boil the ocean to get in. You can come in on a very narrow value proposition, develop value like you guys do and a lot of other practitioners do or technologists do. Absolutely. And then, but you have to go to the next level. You have to. So I've got to ask you, as more of a historic industry visionary, as well as someone who's been in the space, what's revolutionary right now? I mean, like, pretend that I'm a VC or we're at a board meeting, our own partnership. We have a VC firm and we're trying to pick the winners. We talked about this last year, how VCs kind of don't get this new model. Some do, some don't. Most don't, but it's hard to pick the next unicorn, if you will, or as they say. What's the revolutionary big play? What's the big push for startups right now? This is one area where I love the success and you think there's a bubble going on in the valley. I love the success of the B2C companies. The proof in what will revolutionize enterprise software, the blueprint for the revolutionary enterprise software companies, is in companies like Uber and Airbnb. For the design of the apps or for the service itself? Yeah, so for the ability to decouple hard assets from soft service, where the customer gets value but you actually don't own the asset. A great example for it. In enterprise software, anti-money laundering is a, we launched something at Spark Summit. I think I mentioned it to you, where we became the first anti-money laundering engine in the market to take a complex human process. 
On average, a large financial institution has 4,000 to 6,000 humans looking at suspicious activity reports to investigate a false positive that is thrown away from some seemingly inept software solution around AML. If you can take away the hard asset, which is in this case people, technology and the operational centers, split that up into a technology framework where you're delivering a service without the need to build or own the hard asset, that's the future. So the ability to take the actual process, split up, is going on everywhere, right? You look at networking, the same thing. You've split up infrastructure from the service, from the software layer. The ability to do the same thing in enterprise software is going to be the future. And the answer will lie not in the traditional stack where you might keep going. Database functions, ETL. They'll be in business functions. Fraud, risk, pricing, underwriting, management. That's where you have to look to automate those processes. And you've got to be an expert in the field. George, I want to get your take next on your view of the world. Also you've got a research agenda. What are you looking for here at the show and put that across industry wide? Put the puzzle out. What's your mental puzzle and what puzzle pieces are you gathering to put the picture together for your research? I guess I look at, interesting that Abhi brought up Red Hat, that I look at the Hadoop distros as potentially like Red Hat enablers. But because of the economic model, they're not going to be 50 billion dollar companies, but they enable this new class of applications and sort of that's where the value add is going to be. So synergistic impact, not so much direct P&L. Yes, enablers, literally enablers. And it's not the latest whizzy machine learning algorithm. It's applying it to a particular domain that becomes the value. And that's where we may not see huge companies, but we'll see a lot of very valuable companies. 
So what signals are you looking for here at the show and in the industry as we go to Spark Summit next week and a variety of other events, big guys and little guys, George, what are the things that get your attention? That's a good move by that customer or that's a good move by that company. What are some of the things that you look for, for the companies that are really poised to break out and be leaders and the ones that might not be? Put you on the spot here. Okay. Sort of the things that Abhi's doing, where you build on a platform that enables you to do these, you said it more eloquently than I can, but where you're applying this intelligence at the point of interaction. So it might be evaluating credit card fraud in real time. It might be handling omnichannel retailing or banking, but you need a platform that lets you apply customer profile information, for example, in real time with the analytics that inform and influence the customer's interaction. Well, let's go there with the 360-degree view of the customer, a term that I actually don't like because the customer's 360 view changes significantly every minute they take a step in real time, not in real time. So Abhi, answer that question that George was putting together in terms of what signals do you look for in companies, your peers? Now take your CEO entrepreneur hat off, put your CUBE analyst hat on, identify, what are the right moves, what are the right signals when someone's got a certain outfit on, got the right running shoes there, are they running fast? And then two, talk about this 360-degree view of data. Don't like the term 360-degree view either. We're agreeing too much. We need to have a fight like you and John had. We call it total view of customer. We call it exactly what you said, understanding customer behavior. So three things we look for. And this is three things not thinking us, thinking of myself as just hard. Thanks for pushing me there about to say that. Deep domain knowledge. 
Extremely deep domain knowledge around specific, large scale human processes. It could be anything. It could be the process of making coffee. It could be the process of managing fraud. It could be the process of delivering an offer. Second, what I call, are you from the BC or AD era of technology? BC being before Cutting and AD being after Doug. If you're in the BC era, which is, if you started a company before Hadoop came onto the scene, you're done. Not worth it. Don't even look at them, it's a story that's very, very hard to tease out. And lastly, the ability and the smarts to automate intelligence. And what that means is a science that fundamentally does what Google has taught us so eloquently, which is big data and simple algorithms will trump the most complex algorithms. So any company that comes around and says we have the smartest AI engine, throw them out. Doesn't matter. Big data will trump small algorithms. We have to wrap real quick. Last word: is analytics a product or a process? It is a process. Analytics is the ultimate, ultimate business revolution that will automate complex human processes using a lot of open source tools and provide answers for humanity that could not have been provided before. And the application could be the product of the analytics. So process first, product second. Not product first, process second. Okay, this is theCUBE. We went a little bit long with Abhi Mehta and George because the insight is amazing here in theCUBE. Again, this is theCUBE. We're here for three days of wall-to-wall coverage. We'll be live. We'll be right back here in Silicon Valley after this short break.