Okay, we're back live here at the Strata Conference in Silicon Valley. I'm John Furrier, the founder of SiliconAngle.com and SiliconAngle.tv, and I'm joined by my co-host.

I'm Dave Vellante of Wikibon.org. This is day three for us at Strata. Wikibon.org has all the research; SiliconAngle.com is the news and editorial. So go check out those sites, and check out SiliconAngle.tv, where we broadcast all the videos. We've got a huge day today on top of yesterday's. We've got a company called Calpont here today. They do massively parallel columnar databases. We're going to talk to Jim Tommaney, who is the CTO. Jim, welcome.

Thank you. Good to be here.

Yeah, so this is a hot space. Help us understand where you fit into this new emerging database sector.

Sure. So internally we offer the scalability you'd expect from a MapReduce offering. That's actually mapped to a columnar storage that allows for some tremendous I/O efficiency. So we do things fast. And part of our package is just an overall ease-of-use component, because the combination of the two technologies allows access to the data to be very easy.

So, you know, we're very familiar with the Verticas and the Aster Datas and the Greenplums. How do you fit into that space? Obviously you've got the Teradatas and the traditional legacy Oracle stuff, IBM. But you had this emergence of these sort of new MPP-type players. Do you fit in there, or are you new new?

So we fit in there to a degree. We certainly see those competitors. We have a columnar storage. In some cases we're going to map to the I/O efficiency you would expect of Vertica; in some cases the I/O is going to be similar. Internally, though, we actually have a different distribution-of-work model, rather than being sort of a database on top of a database, with each node running SQL and then that data being stitched back together.
We actually translate the SQL into MapReduce internal to our engine, and then that workload is distributed across all the cores in a single server, or all the cores in a cluster of servers.

So tell us a little bit about the company. When did you guys start?

So this effort's been in place since '06. The product itself was basically in benchmark testing in '08, and launched in '09.

InfiniDB?

The InfiniDB product, yes. Launched in '09 as open source, and then the commercial version the following year, in February.

So your thrust is in Hadoop?

No, our thrust is really around a new solution for traditional analytics. So it's not necessarily a solution for unstructured data, because columnar by definition has structure applied to it. But certainly there are plenty of use cases for structured data, and we just make that operation much simpler. So it translates for us into web analytics solutions. We talked yesterday with a company that's doing genomics for livestock. So different applications come into play here.

So how is this concept of big data changing the traditional enterprise data warehouse world?

In my opinion, it's certainly moving them off of the traditional platforms. That's for certain. The general-purpose database has not solved problems in two dimensions. For entity access at scale there are better solutions, and additionally for analytics at scale there are better solutions, and I think those two dimensions are basically stretching the traditional technologies.

Can we talk about column stores for a minute? Maybe just do a little columnar 101 for the audience.

Sure. Certainly there are some confusing terms out there. There's column family, which really is access to an entity with a flexible structure, and that's not what we are. One definition for us is sort of a true column database.
We actually divide each of the fields, each of the columns, into separate stores that can be accessed independently. And what that means is a couple of things. For any query, we can use just the columns required for the query, so the I/O characteristics are far lower than traditional systems. And it's also more flexible, in that you just don't pay any penalty for columns that aren't part of the query. And so it's easier to use from that standpoint.

Okay, so you're saying that basically columns are better than rows in terms of I/O efficiency?

It depends on the workload. As long as the workload is a large query against millions or billions of rows, against a data set that's billions or trillions, it's a clear winner. For any kind of entity access the opposite is true: row-based is better, or column family, or one of these other structures.

Okay, so that makes sense. So what's your open source play? Talk about that a little bit.

So we have a community offering out there that allows people to try the product. There are some features reserved for our enterprise offering, but we want people to be able to try the product. It's part of how we let people know about us, certainly. We are also a MySQL storage engine, so that means that existing tools that connect to MySQL will also connect to our engine. Now, internally we're a non-standard MySQL storage engine. We actually take over the execution of the query and do all the joins within our layer rather than have that happen in the MySQL kernel.

Okay, and so are you software only?

We're a software-only solution, yes.

Okay, what does that give you?

Just flexibility around implementation. We are even looking at solutions beyond Intel architectures in terms of where we're going to run. How do we get to significant parallelism? In my opinion, that doesn't appear to be a solution we want to build hardware around, because that tends to become stale quickly.
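
To make the columnar and MapReduce ideas from this part of the conversation concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not Calpont's implementation: the toy table, the column names, and the helper functions (`scan`, `map_chunk`, `reduce_partials`, `sum_by_region`) are all hypothetical, and a thread pool merely stands in for the cores of a server or cluster. It shows a query along the lines of `SELECT region, SUM(amount) ... GROUP BY region` reading only the two columns it references, computing partial sums per chunk (map), and merging them (reduce).

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy table, stored column by column rather than row by row.
columns = {
    "region": ["east", "west", "east", "north", "west", "east"],
    "amount": [10, 20, 30, 40, 50, 60],
    "comment": ["a", "b", "c", "d", "e", "f"],  # never read by the query below
}


def scan(store, needed):
    """Fetch only the columns the query references; others are never touched."""
    return {name: store[name] for name in needed}


def map_chunk(regions, amounts):
    """Map step: partial SUM(amount) GROUP BY region for one chunk of rows."""
    partial = Counter()
    for region, amount in zip(regions, amounts):
        partial[region] += amount
    return partial


def reduce_partials(partials):
    """Reduce step: merge the per-chunk partial sums into one result."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds values key by key
    return dict(total)


def sum_by_region(store, chunk_size=2):
    """Run the aggregate as map over column chunks, then reduce."""
    cols = scan(store, ["region", "amount"])
    row_count = len(cols["region"])
    chunks = [
        (cols["region"][i:i + chunk_size], cols["amount"][i:i + chunk_size])
        for i in range(0, row_count, chunk_size)
    ]
    # Threads are a stand-in here for cores or servers doing the map work.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda chunk: map_chunk(*chunk), chunks))
    return reduce_partials(partials)


if __name__ == "__main__":
    print(sum_by_region(columns))
```

In this sketch the `comment` column is never read, which is the I/O saving described above, and the result does not depend on how the rows are chunked, which is what lets the map work be spread across workers.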
So you're implying that you can scale better, be more portable?

Yeah, that's been demonstrated for us. We're very happy with both the scalability and the affordability of the system, yes.

Jim, you're the CTO, and we want to apologize to Nick, your VP of marketing, who's out there. Nick wanted to come on theCUBE, but...

Never apologize to marketing.

Dave and I don't like to give up the microphone, so we want to talk to the tech guys. But there's been a big tech conversation around performance. Obviously performance is great. Predictive analytics and real-time analytics are the hot areas everyone's forging ahead into. But the database architecture in hardware has changed. We've had some Silicon Valley veterans on, like Scott Dietzen at Pure Storage, talking about all flash, no more spindles. David Floyer has been doing some research around I/O-centric architecture and infrastructure. With flash and virtualization, it's changing how people use the hardware side to maximize performance, and then the software on top of it. So how do you guys vector into that trend technically, from a product standpoint? All the people out there implementing these new flash architectures can now use commodity hardware and create multiple different kinds of databases. What's your view on that?

So my opinion actually differs slightly. I think those technologies are absolutely differential in allowing more IOPS into the system. But IOPS themselves are a solution for entity access rather than analytic access, in my opinion. And so it's truly differential in that use case. But for ours, we're not IOPS-bound in any use case in our engine, so the differential performance from minimizing the cost of an I/O operation is small. And the other part of that is that, certainly, disk is the obvious enemy for performance.
There's a solution there, but there are also efficiencies within processing in the CPU memory cache that we solve for and that aren't affected by those technologies.

Okay, so Abhi Mehta was on yesterday talking about analytics. Dave, I can't recall the exact quote, but he was saying something around how analytics is paired up. What was his comment around that? Do you remember? Analytics is tied to the data, the data is tied to the analytics. Where you put the analytics is a big strategic decision? Do you remember what he said there?

He was referring essentially to the database job versus the analytics activity, and not making a choice between the two but actually trying to bring the two together. That's essentially what he was talking about.

Is there a new analytics paradigm that people aren't seeing, that you guys are seeing success in? I guess that's my question. There are different philosophies on how you do the analytics with multiprocessing. I mean MPP, for example.

Well, I think this is the answer to your question: the end users are interested in the ease of use of the system, right? So a combination of features that allows easy access to a high-performance system, without having to understand the complexity, is easier for the market to use. And that's where we think we've got a nice edge, because we map the SQL to our MapReduce layer transparently. The end user isn't aware of all that processing, but underneath the covers that's how it's being processed. One of the things we've heard from our customers: at Warner Music Group they talk about how we let them turn their SQL developers into big data developers. And so we think that transparency is interesting.

What experience are you having when you go into these big clients? You've got to go in, you've got to do the whiteboard, you've got to meet with the tech guys, and they want to have that business conversation.
Bill Schmarzo was recently on talking about the cultural change in these companies. And we've heard the theme earlier on: it's a talent game. People are getting educated, but they're not idiots. I mean, they're in tech, they've been doing SQL, it's been around. Structured data is important. What are you seeing as the key aha's for the clients, and what are the stumbling blocks they have to get over?

Well, to go back to one of Dave's points, it's part of our open source strategy. In many cases the tech guys get a chance to get in and get their hands on the product and understand the capabilities of the system, and for us that's half the battle. The rest of it is certainly forming a bond, making sure they understand, walking through the architecture and things like that. All the traditional tools. But a key resource for us really is our access to open source capabilities.

So there was a run on new next-gen enterprise data warehouse companies there. It reminded me of an NFL draft, John, right? Bill Belichick takes a left tackle and all of a sudden all the left tackles start going, right? So there was a spate, you know, Netezza.

Absolutely.

And Vertica and Aster Data, and now it's sort of gotten quiet again.

Yes.

What's your take on all that? You must have been excited watching it.

Well, certainly these are interesting times. I agree with that assessment. I think things have gone past that little bubble, right, where people were excited and concerned that the draft pick had gone. I think that's true. I think people are making more seasoned decisions about acquisitions. But I believe there are still places for differential technology to come to market.

Okay. And so where are you at in terms of the company: funding, headcount, things like that?

So headcount is just 23, 24; customer count, three dozen and moving. The active pool is much larger, so we're really excited about where we're going in the future.
We're in the middle of ramping headcount internally, to do more partner opportunities, extending, and doing things exactly like this to let people know about the product.

And how about funding?

So we're well funded. We're actually privately funded, private equity.

Great. My final question, since we're on a time limit here with our next guest. We're talking about an application explosion. Obviously analytics is big. We had Mike Dauber on, and other VCs we've talked to here, Frank Artale; I spoke on the phone yesterday with Ping Li from Accel. Analytics is really the hottest area. So what are you seeing in terms of the enablement of where analytics is going? Obviously it's the dashboards and all that good stuff. What are the key apps that you guys want to enable, and what kind of benefits?

Predictive, as you mentioned, is tremendous. That's going to allow for much more powerful use of data to deliver a difference for the business. So that's the leading trend that we see. Certainly visualization tools are critical. I'm not going to pretend to be the expert on those; I'm more of the backbone guy.

Yeah. What's your advice to your friends and colleagues and customers when they ask you: hey, what's going on with big data? I don't know. What do I need to know about? How do I get started? What do I do? I have all this old tech.

Well, I have a 16-year-old daughter who's in the middle of career decisions: what to do? I talk to her about how there are certainly tremendous opportunities in data. What's happening five years from now and 10 years from now with data is going to be just life-changing across a wide variety of industries. We've got a sense of it looking five or 10 years back, but I think the future is even more interesting.

Okay. Well, thanks so much for coming on theCUBE. Jim Tommaney, thanks so much. You're in Frisco, Texas; not San Francisco, it's in Texas, near Dallas. You guys love Texas.
Mark Hopkins lives in Texas. We have a SiliconAngle office in Texas. Keon's there. Keon's there. And we love the Texas hospitality. Good old barbecue. You guys are great, friendly folks down there, and it's a good tech scene. So thanks for coming.