Welcome back, everybody. I'm Jeff Kelly with Wikibon. We're here inside theCUBE at Strata in Santa Clara. I'm joined today by my colleague David Floyer, and we've got an interesting guest on today: Lawrence Schwartz, the vice president of marketing at Tokutek. They are a company that's attempting to scale MySQL to big data workloads. You guys are going to be in the Startup Showcase this evening, I understand. So why don't you introduce our viewers to Tokutek?

Sure. So Tokutek is a company that looks at the big data problem from the database side. A lot of databases that you look at today are built on a relational model, and as they get much larger and start scaling out, the performance doesn't really keep up. So a lot of customers will look towards alternatives, maybe going to more expensive hardware like flash, or to NoSQL. What we do is actually fix relational databases. We fix the SQL model, and we do that by going in there and addressing the core indexing technology. The core indexing technology that you find in relational databases has been around for the last 30 or 40 years. It's based on a technology called B-trees, and we offer a real alternative to that called fractal trees. That's what we productize and make available today for MySQL, so databases can scale up to terabytes plus and really have the performance that people would otherwise leave the platform for and try something else.

Interesting. So kind of the genesis of the so-called big data movement, as Wikibon defines it, is that big data is those data sets that are too large, or of a type, that traditional relational databases can't handle. So there's the Hadoop model, which is kind of a different model, and then it sounds like what you guys are trying to do is adapt the relational model to fit into the big data landscape. Is that right?

That's correct. And a lot of people find that they'll start on a very common database
like MySQL, and they'll start small. Then if their business takes off, they start putting more and more into it. They like the model. They like the relational capabilities. They like the indexing capability. They like all the tools, and they have the expertise out there. It's only as performance starts to drop off that they start thinking, what else can I do? And as you know, sometimes they'll throw up their hands and say, I've got to try something else completely different. In some cases that makes sense. But in a lot of cases, if they could fix it and not have to do things like partition their database and deal with all those management headaches, if they could do this without having to buy much more expensive hardware like flash drives or RAM, or if they could avoid going to a column store or something very, very different that might not have some of the other relational database flexibility, they would do that. And we give them a real way to do that.

David, what is your take?

What are some of the big names in the Valley, for example, that might be using your technology to avoid...

Absolutely. So there are some very common cases out there, like online advertising. We have one customer based out of New York, Intent Media, that looks at user behavior online during an interactive session. They might be looking at, if somebody's bought black shoes and black pants, are they going to buy a black blazer next? That's real-time data coming in, and they're trying to compare that against their existing database and get that blazer advertisement in there. So that's one way the model gets used, because it's kind of real-time interactive. They looked at alternatives, and we talked with them about other ways of doing it, and they settled on a SQL model. Another big player is Limelight Networks, a public company with a new offering in the past few months.
They came out with a cloud storage offering, and the unique thing about it is that it has all sorts of quality-of-service guarantees. Part of it is that they're often managing a lot of, say, video assets with a lot of metadata attached, so they could have billions of assets or metadata records to manage. That's what we can help them do. And again, that's the real-time piece plus the combination of a big database and pulling stuff in and out of it. So those are some of the typical plays.

There's a lot of chat around other types of databases, NoSQL, things like that. What are the use cases that will push people towards you versus Mongo? Can you talk about the difference in the marketplace there?

Absolutely. What we find is that if you look at databases as they were designed in the past 30 or 40 years or so, they were very good at handling all types of workloads: OLTP workloads, OLAP workloads, read-intensive workloads, write-intensive workloads. That was fine 30 or 40 years ago, when you had very basic databases, maybe HR records, very sequential data coming in a very orderly fashion. Then as the underlying hardware got faster and the drives got faster, and you went to more online applications and social media applications, all of a sudden people started breaking off from the traditional model, and you have these OLTP and OLAP specialties, right?
As well as some of these NoSQL specialties. So I think people get frustrated today with the relational model, and they say, okay, I can't keep up my indexing, for example, so I'm going to give it up completely and just get the data in as fast as possible. Think of it as a library. A library is nice and organized with a Dewey Decimal System, but if somebody backs up a truckload of books to dump off, it's very hard to take those books and shelve them quickly. That's where people get frustrated, and they might go out to an alternative. The problem with some of those NoSQL alternatives is that you're basically taking a lot of data and just dumping it in, and then you apply a lot of processing power to it later to get through it. For some workloads that's the appropriate thing to do, and that makes sense. But for workloads where you might actually come in and do repeatable queries, it's different, because whenever you have an index, no matter what the technology is, the index is always 10 to 100 times faster. So if you do have a repeatable query at some point in time and you can stand up more indexes, that's where we really have the value differentiation. And that's part of it: because we can insert at 20x, and we have customers doing 80x, you can keep up a much richer set of indexes.

So one of the technologies that a lot of the startup companies have used to solve this problem of large databases involves flash. A lot of the issues are around long latency on the disk. They get locking problems, they get issues like that. So people like Fusion-io, for example, have used flash, actually using atomic writes to flash, as a technique for reducing latency as much as possible. How would your solution compare with that way of doing things, of taking away that issue of the very slow database?

Sure. Sure.
It's a great question, and it really gets at what got me interested in the company, because my background was originally in storage. When you look at how the data goes out to a disk today, if everything's nice and orderly and coming in in a non-random fashion, a spinning disk does well with it. The data just maps in well and all flows in there. When you have more random, interactive workloads, now you're making the disk jump all over the place, right?

And the performance just goes down.

Exactly. So what we do is, instead of having a very static B-tree like a traditional database, the fractal tree does much more dynamic rebalancing of the loads as they come in. As they come in, it pushes them down and keeps them at nodes. It waits for more data that matches to come into a node, and then it aggregates it before it goes out to the disk. So by the time you're actually writing to disk, you're basically filling up the train rather than sending out one car at a time. We get much more efficient use of the I/O to the disk, so we get that advantage. And that also carries over to flash, in that flash doesn't write one bit at a time. It writes a whole group at once. It's a similar type of thing: if you can get a bigger basket of data to come in when you write to that piece of flash, then you'll get those additional benefits and get more performance even out of flash as well.

Okay. So one of the interests that I've had recently is that this whole area of big data is transactional, it's operational, and it's analytic as well.

Sure. Yes.

So one of my posits is that as this technology develops, the real endgame is being able to enable organizations to design their analytics as part of their operational system.

Sure.

There'd be a huge number of barriers to that, with I/O being one of the major barriers, right? So do you agree with that, and to what extent can your technology
enable that sort of vision? Can it enable it for very large companies, or is it maybe only very small companies that can do this at the moment?

Yeah, you've got the great story that we try to tell people, and I think you've nailed it right there. A lot of people are going in two different directions for managing the two different environments, and we let people combine them. Oftentimes our application will start more offline. They'll try it for analytics, and then they'll try combining it into their real-time environment. Because we can handle such a high insertion rate and stand up so many more indexes, now you can do both. You can compare us to, not the traditional, but the default storage engine that you get with MySQL. We actually are better on read performance for different types of workload scenarios, and we're 20x faster in some insertion environments. So you get the best of both worlds, in that you get the high amounts of reads and the high amounts of writes. Now you've got the use case you're talking about, where you can actually work on more analytics on transactional data, or, the flip way of looking at it, transactions with the large data warehouse. We've demonstrated this with tens of billions of rows, so we've got the scalability there. We've got customers doing multi-terabyte deployments today. It really comes down to how you change the fundamental model of how you do a relational database. And it's pretty exciting. Our technology comes out of MIT, Rutgers, and Stony Brook, from some smart guys who have worked at some very big algorithmic companies and have done a lot of research on this over the past ten years. So it's a very, very new and fresh way of approaching it.
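
[Editor's sketch. The "fill up the train rather than send one car at a time" idea from the discussion above can be illustrated with a toy model. This is not Tokutek's implementation and not a real fractal tree: it uses a single in-memory buffer rather than buffers at each tree node, and the class names, the `buffer_size` parameter, and the `disk_writes` counter are all hypothetical stand-ins chosen for illustration. It only shows, under those assumptions, how batching inserts reduces the number of simulated writes.]

```python
import bisect
import random


class UnbufferedIndex:
    """Toy index that touches simulated 'disk' once per inserted key."""

    def __init__(self):
        self.keys = []          # sorted keys, standing in for on-disk leaves
        self.disk_writes = 0    # simulated count of write operations

    def insert(self, key):
        bisect.insort(self.keys, key)
        self.disk_writes += 1   # one car leaves the station per insert

    def contains(self, key):
        i = bisect.bisect_left(self.keys, key)
        return i < len(self.keys) and self.keys[i] == key


class BufferedIndex:
    """Toy index that holds inserts in memory and writes them in batches."""

    def __init__(self, buffer_size=64):
        self.keys = []          # sorted keys already flushed to 'disk'
        self.buffer = []        # pending inserts waiting for a full batch
        self.buffer_size = buffer_size
        self.disk_writes = 0

    def insert(self, key):
        self.buffer.append(key)
        if len(self.buffer) >= self.buffer_size:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        # Merge the whole batch in one simulated write: the full train.
        self.keys = sorted(self.keys + self.buffer)
        self.buffer = []
        self.disk_writes += 1

    def contains(self, key):
        # Reads must check pending inserts as well as flushed keys.
        if key in self.buffer:
            return True
        i = bisect.bisect_left(self.keys, key)
        return i < len(self.keys) and self.keys[i] == key


def compare_writes(n, buffer_size, seed=0):
    """Insert the same n random keys into both indexes; return write counts."""
    rng = random.Random(seed)
    plain = UnbufferedIndex()
    buffered = BufferedIndex(buffer_size)
    for _ in range(n):
        key = rng.randrange(10**9)
        plain.insert(key)
        buffered.insert(key)
    buffered.flush()            # write out any partially filled batch
    return plain.disk_writes, buffered.disk_writes


if __name__ == "__main__":
    plain, buffered = compare_writes(1000, 64)
    print(f"unbuffered writes: {plain}, buffered writes: {buffered}")
```

[In this toy model, 1,000 inserts with a buffer of 64 cost 1,000 simulated writes unbuffered and 16 buffered. The ratio here is purely a function of the chosen buffer size and says nothing about the 20x or 80x figures quoted in the interview.]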
So where is Tokutek going from here? Are you going to be acquired by one of these other companies so they can combine it all? What's your game plan?

Sure. Well, we're growing. We've been out for a few years, and we've added more and more capabilities that people expect in a relational model. Over the years we added ACID and MVCC, and in our latest release we've gotten our in-memory performance comparable to the basic default engine.

So a look-alike? What's that, a HANA look-alike?

So yeah, we've really built it so that people don't have to make a decision, at least with MySQL. We're still a small company, but we're growing quite a bit, and we want to be the default engine that people use with MySQL. I'd say that's kind of our immediate goal. Long term, we'd like to see this in all types of relational databases.

So, one question we're asking everybody who comes on theCUBE, and it's a very big-picture one: what is big data's potential to change society, as our colleague John Furrier puts it? Kind of taking a step back from the nitty-gritty of the technical details, what do you see as the real potential of this movement, both what you're doing and some of the other players we're seeing here at Strata? What can big data do for society? We heard a lot in the keynotes this morning about the big problems. Can you expound on that a little bit?
Sure. We see a lot of commercial uses today, as I mentioned, but I think there are a lot of interesting research uses. We've talked with one organization that's trying out our stuff, and they're actually a research facility. They work for the government, and they're looking at all sorts of astrophysical uses: looking at the sky, looking at the environment, trying to figure out where the gamma rays are coming from and where they're going, what that means for the planet, and how the universe is growing. So it's a very interesting, much bigger question. I can't say they're our typical customer, but they get us pretty excited when they call in and say, okay, we're doing this instrumentation on this satellite, and we're trying you with this phase, and we have all this incoming machine data. It's an exciting alternative use for us. It tells us about more than what we're going to buy next week, which is important and is how we make money, but this answers a lot of the bigger questions.

All right, great. Well, Lawrence, thanks so much for coming on.

Sure.

Tokutek: go check them out at Tokutek.com. Good luck tonight in the spotlight, or I should say the startup competition. When is that tonight?

It starts at 6:30. We'll be there at 8 o'clock, and definitely come by.

Yeah, if you're here on site at the Strata conference, definitely check it out. Again, thanks so much for coming on.

Great.

And I think we're going to take a short break, and we'll be back shortly. Thanks again.