It's theCUBE. Here is your host, Jeff Frick.

Hi, Jeff Frick here from theCUBE. We are on the ground at the Westin in San Francisco. We are at HBaseCon 2015. We were here a number of years ago. We wanted to come up and get an update on what's going on. A lot of excitement and a lot of really interesting developments. We're joined in this next segment by our newest Wikibon contributor, George Gilbert. Hey, George.

Good to be here, Jeff. And we also have a special guest, George Chow from Simba. Simba fits into a lot of interesting areas. So, George, I'll let you describe what you guys do, what your mission is.

Sure. Thank you, George. Thank you for the time to speak with you guys. Simba Technologies has been providing connectivity to databases for a long time. We've been providing ODBC and JDBC technologies for about 20-plus years. And in the last little while, we've seen a lot of resurgence in the database field with NoSQL and with Hadoop. And so we've been essentially working nonstop to bring all the good technology that we've built up to date.

We joke a lot on theCUBE that a couple years ago, if you told somebody at a cocktail party that you worked on databases, they might go talk to somebody else. But now it's a pretty dynamic space with a lot of interesting developments, especially over the last several years.

So, for a couple years, starting, I don't know, maybe in 2009, it was NoSQL, you know, and NoSQL meant no SQL. You said something interesting earlier, that Impala, the MPP SQL query product from Cloudera, was a turning point. Tell us what that meant and what's followed in that train.

Sure. I remember starting out in, you know, 2009, 2010, 2011. In that era, the whole data space was essentially, as you say, NoSQL. Everybody's statement was like, you don't need schema.
And so when we took a look at the market, we said, you know, there's still a need, because when we look at the tools, the tooling was all around SQL. And so when we came to the market, it was a tough stretch in, like, 2011, for instance. But what happened was that we were making our pitches, and in 2012, I mean, what I saw was really the watershed moment, because I think it was, like, in October when Impala was launched. And at the next Strata, it was like January 2013, and suddenly you could see, literally, the entire floor change. Suddenly everybody thought that SQL was a good idea again. And to me, that really marked the starting line for everybody to bring their own SQL MPP technology back.

So tell us, what did that mean in terms of customers who had already invested in SQL data warehousing technology for decades? What were they doing now with Hadoop and SQL interfaces to Hadoop?

Well, I mean, what I see is that there's now a rapid evolution. As you see, traditional data warehousing technology is built on SQL, and this new generation of technology that is just adopting SQL essentially now will fit. And if you take a look at the whole data lake concept, for instance, it's basically a hybrid idea, talking about something that previously was fairly, you can say, feature poor, like HDFS, becoming a lot more capable, because now not only do you have HDFS, you can actually put a schema on it, you can apply any type of metadata to it. So just to pitch maybe one project that I'm involved in, Apache Drill, for instance, that's what they're trying to do. What they're saying is that you can actually put the metadata in the file system alongside the data. You don't put it behind any walls. You don't even put it inside the metastore.
I mean, you can definitely use the Hive metastore. But if you take a look at Drill, Drill allows you to put views inside the file system.

Drill being the MapR-sponsored product. So what does Drill allow you to do in terms of taking the best of traditional data warehousing products and now the flexibility of HDFS?

I think Jacques put it very well. Jacques Nadeau, the committer, I think he was the one who coined the term, which I quite like, which is punk SQL. Most people know SQL, as he calls it, as mothers made it. So it's the good old-fashioned SQL that you all know, the one you learned in school. But any of these SQL data stores nowadays are obviously trying to provide more capability. And so they do that by way of extensions. I mean, he coined the term punk SQL for basically what I would call syntactically consistent extensions that look and feel right to somebody who is actually writing SQL.

And so tell us then, if you had to do this schema on write, you know, everything organized up front before you even put the data in, with a traditional data warehouse, how does this help?

Well, this helps a lot, because now you don't have to have that schema, or rather the schema can be fairly flexible. Again, in the case of Drill, you can create definitions on the fly, in the file system, along with the files themselves. So that means those definitions, because they sit alongside the data in the file system, are accessible to anything. And so you don't have to worry about looking up the metadata, because if you can get to the data, you can get to the metadata. So that means it's more self service.
You don't have to go to someone and say, in your upstream system, you know, get me more data, fix the pipeline. If you can get to the file system, literally in the case of Drill, that's what it is. If you can get to the file system, you can see the files, whether it is a TSV file or Parquet file, whatever. JSON files, they're the most extreme. You can actually have the metadata right there alongside, and you can rewrite that. You can open that up because of multiple views. It's basically, literally, the ultimate self service there.

Do you see Drill, then, as being as big a catalyst for change as Impala was several years ago?

Yes. I mean, again, given the fact that I've worked on it, obviously you can call me biased, but I definitely see there's a lot of capability there that, I would say, MapR is doing a fairly good job of realizing. So yes, but I would say at this point it's still an open race. There's a lot of good technology and a lot of good products coming together. And it is really the case that if you need an MPP engine today, you really have an embarrassment of riches.

Tell us about that embarrassment of riches. Why is that the case, and what are their different sweet spots?

Um, I mean, the field is new, and I'm not the best authority to give the word on that. But in terms of sweet spot, I would say it depends on what you're aligned with. One of the big things that I see is literally the access. Some of the engines are definitely aligned with some of the file formats and some of the storage. So, for instance, if you take a look at Impala, it probably prefers and runs best against things like Parquet. That's not to say that other engines don't do it.
But I mean, if you invest in Parquet, you probably have gotten started with Impala, and that's not a bad choice at all. And in the case of, let's say, Hive, for instance, similarly, if you're invested in Hive, you might already have a lot of investment in ORC files. Again, it's not a bad choice. And given the pace that the Apache Hive project has sustained, Hive is, as you can tell, still a very viable project. In many ways, you can call them literally still the incumbent to beat.

And what about the many other MPP projects, whether it's HAWQ or Greenplum itself or Actian or Vertica? You know, we saw so many grow up in the pre-Hadoop world. Where do they fit?

I see them as being commercial options that make a lot of sense for you if you have a particular need that they fulfill. A lot of it is literally what you are comfortable with and whether that particular product has features that you like. So let's say in the case of Vertica, if I recall, they have a fairly good projection feature.

And, you know, which means culling the columns.

No, actually, if I recall the way they explained it, and I recently refreshed myself on it, if you want to optimize a query, you basically can build a projection set against that. And even though you may feel that is expensive, I mean, these days storage cost is actually not too great, and the performance that it gives you is actually pretty good. And, like I said, it's still a viable option.

So we're getting the hook. We're late on time. But before you came on, George said you're the guy, you're like Switzerland and Penn Station. You're the neutral observer and everything goes through you guys.
So from that point of view, and you talked about some really significant catalyst moments in the industry, some watershed moments, what are you looking forward to over the next six months, 12 months?

I'm probably looking forward to, I think, HBase making some big milestones this year in some way or shape. I mean, I've been watching HBase probably since about 2011. And this year is just special for the fact that there are so many engines and so many systems that are now coming together, and that are actually growing up and building on top of HBase.

Can it be both a transactional store and an analytic store as well?

Yes, actually, and if you take a look, there literally are, I think, companies who are taking that approach. Offhand, if I recall, who is it now? Trafodion or Financer? I think Splice Machine, for instance. Yes. Splice Machine, for instance, is making a go at making a transactional system of it. Obviously, Phoenix, to some degree, has quite a bit of transactional capability. When you take a look at what Microsoft is doing with their commercialization of HBase on Azure, I mean, you can see that as really transactional.

Oh, good stuff. Well, George, thanks for taking a few minutes out of the conference to sit down with us. Appreciate it. And George, as always, good to sit down with you as well. I'm Jeff Frick. We're at HBaseCon. You're watching theCUBE. Thanks for watching. We'll see you next time.