Hi, everybody. We're back. This is Dave Vellante. Big data after dark. We're here at the tug event. We're in Boston, and we're catching all the entrepreneurs. We're seeing the VCs in action tonight. There's a lot of activity going on in Boston. You know, five, seven years ago, Boston used to be relatively quiet in the tech space. You had a big pharmaceutical push, a big life sciences push. Well, folks, enterprise tech is back. Big data is really the wave that everybody's riding. And we're here with Adam Fuchs, who is the CTO of Sqrrl, a company that we've been tracking. You've been following some of the developments. Adam, welcome back to the Cube.

Thanks, Dave.

So we were talking offline here. You have a new concept, at least it's new to me, that I'd like you to introduce to our audience. Talk about it a little bit.

Yeah, so the thing I'm excited about these days is schema statistics.

Schema statistics. We know what statistics are. What are schema statistics?

Yeah, so you may have heard of this concept of flexible schemas, right? So this is: you bring in a data set, you don't know anything about it, but you want to analyze it alongside all your other data, right? So if you have a database that supports flexible schemas, you can just throw that data in there. It sits alongside it, and you can then query across that and all your old data at the same time, right?
So that's flexible schemas. Schema statistics is a tool that lets us do a little bit of modeling along with that. So traditionally, if you're going to bring a lot of big data sets together, you kind of have two different approaches you can take. One is put everything together in flat files and, you know, run MapReduce over it. The other is, you know, do a couple of years of ontology modeling to figure out how all the data fits together, right? Neither one of those is perfect. On the one side, you really don't have much idea of how things fit together. On the other side, you do a couple of years of modeling, and then maybe your data fits, maybe it doesn't, right? If it doesn't, you start over and you do a couple more years, right? Schema statistics is middle of the road here, right? So we take data, we throw it into a flexible schema database, and then we ask questions about what's in there, right? What's the schema? What fields are present? How many times did I see each field? What are the categorical values? Those types of things. So what this supports is the iterative data refinement cycle.

So it sounds like the system is autodidactic. It teaches itself as it goes along.

Yeah, there's a lot of learning involved here, right? And this is kind of a base layer. It's one of these discovery analytics that supports machine learning over schema matching and those types of problems.

Now, last time you were on was at hack/reduce. We talked in concept about Sqrrl analytics. I know you can't talk much about it because it's an unannounced product, but it's getting close. I know you guys are very excited. We had Eli on earlier. He showed us a little leg. So you've got to be excited. This is your baby coming to life.

I'm really excited, and we're, you know, right on the cusp of releasing version one of Sqrrl analytics. And one of those analytics in there is schema statistics.
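The questions Adam lists (which fields are present, how often each one appears, and what the categorical values are) can be illustrated with a small sketch. This is not Sqrrl's implementation, which is not described in the interview; the function name `schema_statistics`, the `max_categories` threshold, and the sample records are all assumptions made up for illustration.

```python
from collections import Counter, defaultdict


def schema_statistics(records, max_categories=10):
    """Summarize the fields found in a collection of flexible-schema records.

    Hypothetical helper: each record is a dict whose keys may differ from
    one record to the next.
    """
    field_counts = Counter()              # number of records containing each field
    value_counts = defaultdict(Counter)   # per-field tally of observed values

    for record in records:
        for field, value in record.items():
            field_counts[field] += 1
            value_counts[field][str(value)] += 1

    stats = {}
    for field, count in field_counts.items():
        distinct = len(value_counts[field])
        stats[field] = {
            "count": count,
            "distinct_values": distinct,
            # Assumption: a field with few distinct values is treated as categorical.
            "categories": dict(value_counts[field]) if distinct <= max_categories else None,
        }
    return stats


# Records from different sources that do not share a schema.
records = [
    {"name": "alice", "country": "US", "age": 34},
    {"name": "bob", "country": "US"},
    {"user": "carol", "country": "CA", "age": 41},
]

stats = schema_statistics(records, max_categories=2)
for field, summary in sorted(stats.items()):
    print(field, summary)
```

A summary like this hints that `name` and `user` might be the same concept under two names, which is the kind of observation that could feed the iterative refinement cycle mentioned above.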
Now, talk more about this technology. Where's it come from? Is it out of academia? Did you guys invent it? Did it come from Google? Talk about that a little bit.

Yeah, so it's a mix, right? A lot of these things go back 20, 30 years, but we're really bringing them together under this purview of big data. Really, a lot of this was incubated inside of the National Security Agency. We were looking at a lot of these concepts for bringing disparate data sources together, sources where we didn't control the schemas, right? A lot like if you're going to do a search of the web: you don't control the format that people put on their HTML pages, right?

Let me ask you a question about patents. In this world of open source, you know, sometimes people forget about patents. What's your philosophy and strategy around patents? Have you filed a bunch? Have you filed a ton? Have you filed none? Talk about that a little bit.

Yeah, so we're just getting into the whole patent battle here. It's interesting: coming out of the federal government space, patents are a little bit different, a little different concept. The federal government actually can hold patents, which I found interesting.

I didn't know that. Yeah, my mind's saying, wait a minute.

Yeah, they can't hold copyright, but they can hold patents, and they can actually request patent royalties. But, you know, in the open-source space, that's very different, right? People look upon patents in general as a defensive mechanism, right? So we're looking at potential patents that we can throw on top of Sqrrl analytics, and I think there are a few that we might get in there. But really, you know, we're looking at those as ways of protecting our own mechanisms, our own software.

Yeah, right.
I mean, it's very rare that a small company uses them aggressively, although we have seen, I don't know if you follow a company called Cleversafe, who's actually begun to use its patent portfolio. I think it's sort of a smoke screen, but I think that's absolutely the right thinking, that defensive piece.

Yeah, absolutely. And, you know, we're kind of in a good situation where we're drawing a lot of our technology from Google, and Google sort of has a patent umbrella that actually covers some of this technology, which is nice for us. I mean, their whole "do no evil" motto, it's a good thing to build on top of.

Yeah, so you make friends with them, maybe develop a little IP on your own, share some of that with the community, and then, you know, try to make some money while you're at it.

You know, I think at this point we're at the cusp of a lot of new ideas in big data processing, and these things are so early on. Your patents are going to be worth something, but I think a lot of this technology has yet to be developed. So we're really excited about the pace of new technology.

Yeah, I mean, we were talking to Eli before about, you know, the transition point that we're in. A lot of people are kicking the tires in big data, playing around with various technologies. HBase is obviously one of them. But they're looking forward and seeing, wow, we may need something more robust, something more scalable, something more secure. That's obviously where you guys play. You're going hard after the developer community. Obviously, you've got to get traction in the marketplace. Maybe talk about that initiative a little bit.

Yeah, absolutely. So first of all, I'd like to say that Sqrrl analytics is really trying to be at the nexus of security, scalability, and adaptivity.
So we're trying to really bring all those things together. One anecdote I can tell is that if you build a secure system, maybe it's not all that useful, because if you can't ask questions about the data that's in there, you don't get much out of it. So security on its own is not all that useful. When you couple it with scalability and with adaptivity, adaptivity in the sense of being able to build lots of applications on top of it, that's really where you get a lot of core capabilities in terms of big data processing.

Well, the thing I really love about what the Accumulo project has done is you've developed that fine-grained level of security without the compromise on performance. If you tried to bolt that onto an existing database, what would happen to the performance and scalability?

Well, I mean, the really tricky part there is that if you try to model security at the same time as you're modeling your application, then you can model one application that way, and in the next application you've got to remodel security and how it fits in there. So the nice thing about fine-grained access controls is that you can separate the modeling of the security and the application, and it makes application development a lot cheaper on top of it.

Yeah, and it's almost as though, you know, you guys are trying to position security as something you can just forget about, and worry about all the other value that you can bring to the table.

Well, at least you can solve it as a separate problem.

Which it really is. I guess you can't ever forget about security, can we?

Hope not. I mean, we're based on people not forgetting about security.

Excellent. Well, listen, Adam, really appreciate you coming on. You're such a wealth of knowledge, and it's always good to see you. And we'll let you get back to the event.

All right. Thanks, Dave. Nice to see you again.

All right, everybody, keep it right there. We'll be right back with our next guest.
This is the Cube. This is Dave Vellante, and we're live here at the tug event. Keep it right there.
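Adam's point about fine-grained access controls, that security can be modeled separately from the application, can be sketched roughly as follows. This is a simplified illustration and not Accumulo's API: the `Cell` class, the `scan` and `record_view` functions, and the sample labels are hypothetical, and a plain set of required labels stands in for Accumulo's richer visibility expressions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One stored value plus the labels a reader must hold to see it (hypothetical)."""
    row: str
    column: str
    value: str
    required: frozenset = frozenset()  # empty set means visible to everyone


def scan(cells, authorizations):
    """Security layer: yield only cells whose required labels the reader holds."""
    auths = frozenset(authorizations)
    for cell in cells:
        if cell.required <= auths:
            yield cell


def record_view(cells, row):
    """Application layer: builds a record and contains no security logic at all."""
    return {cell.column: cell.value for cell in cells if cell.row == row}


table = [
    Cell("p1", "name", "Pat"),
    Cell("p1", "diagnosis", "flu", frozenset({"medical"})),
    Cell("p1", "billing", "overdue", frozenset({"finance"})),
]

# The same application code serves readers with different authorizations.
nurse_view = record_view(scan(table, {"medical"}), "p1")
clerk_view = record_view(scan(table, {"finance"}), "p1")
print(nurse_view)
print(clerk_view)
```

Because the filtering happens in `scan`, a second application could reuse the same labeled data without remodeling security, which is the cost saving described in the conversation.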