Welcome back, everybody, to the Strata Conference here in Santa Clara, California. I'm Jeff Kelly with Wikibon. We're live inside theCUBE, SiliconANGLE.tv's premier broadcast. We're winding down the third day of coverage, but we've saved the best for last. Our good friend and CUBE alum, Lawrence Schwartz from Tokutek, welcome back to theCUBE. Oh, thanks for having me. Absolutely. So, you know, we were talking a little bit beforehand. Before we came on the air, one of the big topics here, of course, is Hadoop. Everyone wants to talk about the distribution news with Intel and Greenplum and some of the reaction from guys like Hortonworks. But there are other technologies out there besides Hadoop, even though Hadoop seems to get a lot of the press for one reason or another. So what you guys do is very interesting: you take MySQL, a very popular database, and really just supercharge it, allow it to scale, increase performance. MariaDB as well. That's right. Right, so we've had you on theCUBE before, so I think we understand what you guys are doing, but I understand you're also looking to move into new areas, new databases. You mentioned Mongo, which of course is a very popular database, so let's talk about that a little bit. Yeah, yeah, sure. So, you know, our basic technology is about how you improve indexing for databases. You know, databases have used B-trees for 40-plus years, and it's been a very effective technology with, you know, very sequential information. But as the data has become much more random and varied, how do you deal with that? So our fractal tree indexing is part of our TokuDB offering, which, as you mentioned, is used in MySQL.
It's been used by a variety of different customers now, and it really helps with performance as the database gets larger, particularly as you go bigger than memory and you want to keep up with insertion performance, you want to keep up with lots of indexes, and you want to have good compression and good schema flexibility. So we've seen the interest in that market and helped a lot of, you know, public and private companies go that route. And then what we saw, and the vision of the company, is: how do we take this indexing technology and really help other databases? Anytime you can index something, you can get better performance, if you're going to come back to the data again and want to get ideal performance out of that query. So we've looked around, and our customers were telling us that if you look at the ubiquity of MySQL and what has changed over the years, you find that MongoDB is a very, very popular alternative database. People like it. It's very lightweight, easy to get going with, easy to bring up for an application, easy to kind of scale out and build out that way, and it's very developer friendly in the language and the way it's written. You don't have to worry and think about schema as much. But at the end of the day, if you want to add indexing to MongoDB, how can you get better performance? They face some of the same limitations as MySQL: as you look at something that grows outside of memory, what happens to performance? So we went in there and started to do some initial testing. We're kind of in our beta phase and getting some good feedback from users out there on how we go in there and, you know, replace the B-tree with a fractal tree, and what happens.
And what we saw was, you know, very similar characteristics to the changes we made when we went to MySQL. We're able to get much better insertion performance on a large database, something that could be 10x or more. It could be an order of magnitude or more, or maybe even two orders of magnitude depending on the workload, for query latency. And then we also saw compression as something that we're able to do. MongoDB by itself doesn't offer compression, but because we write much larger blocks out to disk, you can actually get much greater compression when you put a fractal tree in there. So we've seen that, and we've gotten a lot of enthusiasm from the community for going in there. You know, our VP of engineering spoke at one of the Mongo events in Boston in the fall and kind of kicked off the interest, and we're getting a lot of user feedback now, hoping to put that in the market in the summertime. So what are some common scenarios you're finding? You know, what are the first kind of roadblocks they hit when they start to look to maybe a solution like yours? What are the common times in the life cycle, I guess, of the use of something like Mongo where you typically see them starting to hit limits? Yeah. Can you kind of shed some light on that? Sure. It's usually the case, and, you know, we're still kind of learning about the environment, but it's the case where people start a project or a small project, it's very lightweight and easy to use, and they get going down that path, and then, you know, the project's successful, right?
And then they start adding more and more to the database, and the next thing you know, you're going out of main memory. It depends what that could be: it could be 30 gigs, 60 gigs, whatever the number is, and you're all of a sudden writing out to a disk, whether that's a rotating disk or a flash disk or something like that, and that's where people start to see the performance bottleneck show up. It could be anything from the query latency starting to be a problem to something around how many inserts I can handle or how fresh I can keep the data. It could be a real issue. So that's where people start running into bottlenecks and start looking for alternatives. They might look at alternatives like, you know, if they're not using flash, do I go there? If it was MySQL, they might look for a different database. But people start wondering, what do I need to do now? Do I need to go off of Mongo to maybe Couchbase or something like that? And we're hoping to provide a good solution, just like we do with MySQL: as you're successful and as it grows, keep the database and really get extra performance out of it. So could you apply this potentially to other databases? Is that kind of the business model, the business plan? That is the grand vision: you know, knocking it down one domino at a time. MySQL is a great early place to start: very open architecture, ubiquitous use, and they have a very flexible storage engine layer where you can just go in there, add things, and be an alternative. So we've gone in there and done that. Good place to start, got customer validation, built that out, and then the next kind of domino, very widely used, is Mongo. But you can imagine, and we imagine, this being used in lots of databases, whether it's relational or non-relational. Wherever there's an index, we'd like to see it be a fractal tree index.
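[Editor's note: the point made earlier in this answer, that writing larger blocks to disk allows greater compression, can be illustrated with a small experiment. This is a minimal sketch and not Tokutek's implementation. It uses Python's standard zlib module rather than whatever compressor TokuDB uses, and the record layout, the helper names `make_records` and `compressed_size`, and the 4 KiB and 64 KiB block sizes are all assumptions chosen purely for illustration.]

```python
import zlib


def make_records(count: int) -> bytes:
    """Build synthetic, repetitive log-style rows (hypothetical data, illustration only)."""
    rows = [
        f'{{"user_id": {i % 500}, "event": "click", "ts": {1360000000 + i}}}\n'
        for i in range(count)
    ]
    return "".join(rows).encode("utf-8")


def compressed_size(data: bytes, block_size: int) -> int:
    """Compress each block independently, as a block-oriented store might, and sum the sizes."""
    total = 0
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        total += len(zlib.compress(block))
    return total


data = make_records(20_000)
small_blocks = compressed_size(data, 4 * 1024)    # assumed "small block" size
large_blocks = compressed_size(data, 64 * 1024)   # assumed "large block" size

print(f"raw bytes:            {len(data)}")
print(f"4 KiB blocks, total:  {small_blocks}")
print(f"64 KiB blocks, total: {large_blocks}")
```

[Each independently compressed block pays its own header and has to relearn the repeated patterns from scratch, so fewer, larger blocks tend to produce a smaller total on repetitive data like this. With zlib specifically, the gain levels off as blocks grow well past its 32 KiB window; other compressors behave differently.]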
Well, that's interesting because, you know, again, as we talked about, while at this conference Hadoop seems to be front and center, you've come across instances that really illustrate the fact that, as we say, Hadoop doesn't equal big data. It's part of the equation, but there are certainly instances where it's not the best choice. We were talking about an example, and I think it was in life sciences and healthcare. So can you share that, if you would, with the audience? I think that was pretty illustrative. Yeah, yeah, we've been learning about this and it's very interesting. We've gotten a lot of requests over the past couple of months from the genomics space for doing bioinformatics and gene sequencing. We work with places like the University of Montreal on their gene sequencing, and we've gotten interest from a number of others. And so I've been going to more and more events to kind of learn about what's important here and what matters, and I heard other people asking the question, well, how come you don't just use Hadoop for this? And apparently, at least in that particular field, a lot of the computations that they do in the sequencing tend to be done more in series rather than in parallel, right? So it's not by nature a very parallelizable task. And when you look at how their use case developed, a lot of them start with something simple like MySQL. Somebody starts with a little desktop thing, and then the next thing you know, it's lots and lots of people in the population and it grows big really quickly. So it doesn't fit that framework, but they still need extra performance as they go from their desktop to a large population. And that's where, you know, really good indexing can really make a difference. So of course we're here at Strata. You've been a frequent attendee here.
So I'm curious, what is your take on the focus of the conference and, I guess, the demographics we're seeing? You know, I think if you were here three years ago, from my perspective, it was a much geekier conference. It was a lot of really hardcore data geeks. And now, in my opinion, we're starting to see some of the bigger vendors here. It's a little bit of a shift to a more corporate focus, which isn't necessarily a bad thing. You know, certainly applying big data to business problems is important. But from your perspective, what are you seeing? With the people coming by your booth, have you seen kind of a progression in the type of attendees? Sure, sure, I think you're right. It has shifted a little bit. We used to see more of the DBAs coming by, and administrators kind of deep in the iron there, trying to figure out what it meant for big data and how they were going to handle the plumbing. But what's interesting here is, I think there are more and more people on the application side coming in this time. They really want to understand: what are the latest tools? How am I going to utilize them? Where am I going to put them in? And, you know, tell me, how am I actually going to make money from this? Or how am I going to derive value out of my data? And it's interesting, because then you hear more of the real-life use cases, right? And then you can see where you fit in as a vendor. You know, where does your solution come in? What could you do differently? So in some sense, it might be a little less geeky, but on the other hand, as a vendor, to actually learn what's going on and to see how people are using it, that's pretty valuable. Yeah, so share with us some of those use cases you've been hearing about, both, you know, here and from your customers.
I mean, you're dealing with, especially as you move into Mongo, for instance, a very popular database, as you mentioned, for building kind of web-scale apps. So what are some of the things you're seeing out there? Maybe some of the anecdotes you've heard at the show. What are some of the interesting use cases, to really kind of start to get people thinking about the ways to really use and monetize the data? Yeah, yeah. So I think, you know, we see it a lot in kind of the traditional things that you hear a lot about, where people are trying to do analytics on, like, online advertising and trying to figure that out. So that tends to be a popular area. I hear a lot of people talking about social graphs and social search and how to do that, so I hear a lot of interest in that area. And then, you know, there's just machine data, right? And how do you process that? You know, we work with places like NASA on some of these projects, where they have a lot of machine data coming in from satellites. And how do you keep that fresh and really be able to query on it pretty quickly? And I think, at least from our perspective, those are some of the interesting things that we work on and hear use cases for: I have a lot of data I want to deal with, but I also have a lot of data coming in at that kind of high velocity, right? So how am I able to maintain that and at the same time quickly turn around and turn that into something actionable, whether it's something on a website or a web application, or it's something for the government, for an application they're doing, trying to understand, you know, this event happened on the satellite, what does that mean and how do I interpret it? So. Yeah, I mean, I think it's very interesting.
I mean, from my perspective, one of the really last miles of this big data trend is the applications, really kind of bridging the gap. You know, okay, you've got the data, we're learning to process it better and store it more effectively. And then data scientists and the tools they use are starting to improve, and we're starting to see some really interesting insights from the data. But then you've got to take that and kind of put it into production, put it into applications that can actually repeatedly apply those insights for business value. So do you feel like we're making progress on that? And where are we in that kind of life cycle of seeing the real promise of big data, getting to the promised land, if you will, at the end of the journey, where we've really got applications in production that are really impacting business on a large scale across, you know, all various industries? Sure, sure. I think we're kind of at the beginning of it, at least from what we see in our customers. They're really seeing what the value is when I have a large amount of data and I can do analysis on it in real time and turn around a report or tailor what I'm doing. You know, take online advertising, right, where they look at people coming in and they try to decide: you just bought a black pair of shoes and a black pair of pants, so are you going to buy a black sweater next? Is that the next move? That's a combination of what you just did on the site, what just happened right now, combined with what they know about you from before and what other data they have on other people. So you've got to be able to maneuver all that data, move it around very quickly, and keep up with that volume.
And I think people are coming up with, you know, we have customers coming up with really creative and new ways to deal with that. Like, for example, if you're a large online shopping site, maybe there are times when you realize that this person's not going to buy something, so you can advertise something else to them from a competitor and sell the space, right? You've got to make those decisions in real time, and it's something that I don't think was possible a few years ago. And now these tools are enabling different ways of thinking about how to maximize the value out of the business. And kind of a related question. I was speaking with Pauline Nist from Intel earlier today, and she was making the point that she doesn't think there are enough business people here trying to understand what you can do with data. And she even said maybe we need a new conference to focus more on the business people. I mean, what is your take on the understanding of the business world at large about some of the power of these new tools and technologies and approaches that are being developed? You know, sometimes we get caught in our little bubble in the big data world, and certainly in IT. And then, you know, we're on the coasts, we're in Silicon Valley or Boston or New York, and there's a lot of space in between and a lot of industries that are not quite on the cutting edge, necessarily, of IT. So what is your sense of the understanding out there from the business side? And not necessarily the big players. Obviously a lot of the big banks understand some of this stuff, and the online retailers, and these cases we were talking about.
But what about some more, you know, traditional, even small and mid-sized businesses, for whom maybe the only time they really hear about big data is once in a while when a story crosses the wire or something? But really, it's not part of their day-to-day yet. I mean, do you feel like there's a long way to go in terms of educating business? I think that is true. I mean, a lot of our very interesting users and interesting cases are people who are very IT savvy, right? They might be, you know, a small startup where the guy is the IT guy, he's the founder, and he really knows how to monetize this and where the value is and things like that. So, yeah, those are the people, I think, taking advantage of it. There's a lot to be done on the education of the business side. I think there are a lot of people on the IT side who see IT as kind of a cost center, right? How do I reduce costs on that? And we still try to play into that messaging. I mean, we reduce, like, storage footprint with more compression. People understand, okay, my storage footprint is smaller, I save money. But that's not as interesting as a business sale, which in a lot of cases is about how I do something and what this enables me to do. And I think what works for us, at least in our small community, is understanding and sharing the stories of what people have done and seeing the value of that. So when I go to these conferences, and, I mean, I've been busy meeting with customers and in our booth and whatnot, I most enjoy the sessions where people give those use cases, right? Here's what I used, here's the value I got, here's what was useful there.
And so more tracks like that, I think, help spread the word: I used it for this, I saw the value, it's a real use case, I used this tool and that tool, and I put them together. And that kind of makes it real, right? So it might not be a separate conference, but more tracks focused on, you know, here are some real-life use cases and stories. Right, which is exactly what we try to do on our Peer Incite, which we were talking about earlier. We're going to have Tokutek join us for one of those, so definitely tune in for that in about a month. But thanks so much for coming on. Last question. So tell us, what can we expect from you guys in the next six to 12 months? What's on your roadmap? Obviously the MongoDB development, but what else is on your roadmap? Yeah, yeah, so that's a big effort for us: how do we go there, and how do we bring that to market? So I think we'll be very, very focused on that. And then, depending on the reception, the question becomes, where do we go next? And, you know, we've tried little integrations. We've tried other things with, like, Cassandra, and we've tried it with other major databases that are out there, the bigger vendors or the more traditional relational ones. So I think, as we go along our stepping stones, we're just trying to figure out which one we can knock down next and where there is the most interest. And we'll be keeping our eyes open. We try to make the fractal tree indexing available via a Berkeley DB interface as well, so people can go in there and try to find new use cases. And I'm hoping that with people playing with it now, they'll come back and tell us, you know, we want to embed it here, we want to do something here.
I had a conversation here with a place doing social graphing, and they said, hey, we'd love to try your index there. So, you know, maybe that's the next step. Who knows? That's fantastic. It's exciting times. Well, Lawrence, thanks so much for coming on. Lawrence Schwartz from Tokutek, Vice President of Marketing. Thanks for joining us again on theCUBE. Appreciate it. And we'll be right back with Dave and John, and we'll wrap it up. All right, thanks for having me. And we will not be joined by Dave and John, unfortunately. This is live TV, so sometimes we have these moments. So this is a wrap from the Strata Conference here in Santa Clara. We've had three days of live programming. As you can tell, we're all pretty tired after the long three days, but it's been exciting. A lot of good content. I hope you've enjoyed the coverage. And of course, we'll see you at our next live broadcast.