Live here in New York City, ground zero of big data. This is siliconangle.com's coverage of Strata + Hadoop World. I'm John Furrier, the founder of SiliconANGLE. I'm joined by my co-host. I'm Dave Vellante of wikibon.org. Check out Wikibon for all the free research where peers go, hit edit. We'd love to have your contribution. We're here with a representative from Google. Michael Manoochehri is a developer relations engineer. Michael cruises around. He works with developers, helps them out, transfers a lot of knowledge from Google. Michael, welcome to theCUBE. Thank you, it's good to be here. Let's see, at entangled Michael. That's his Twitter handle if you want to follow him. So thanks for coming on. Great, it's good to be here. So this is a cool event. We were just talking off camera about all the contributions Google makes just by inventing cool stuff and writing a paper. Guys like Doug Cutting, you know, read it and say, oh, I could invent Hadoop from that. So it's quite amazing to see that. Culturally, where does that come from? I mean, Google just loves to give back. Yeah, I mean, I would say we're forced to deal with big data. The volume and velocity of data that comes into Google is massive, and we're always forced to invent new technologies to deal with it. We need to ask questions quickly. We need to get answers quickly. So we're just put in a position where we need to innovate quickly. So some of the stuff we do now, in five years you'll see it become an open source project or come out later, because we've already solved this problem internally. People read our research papers and want to figure it out themselves. But we're already solving these problems now. Well, thank you for that. There's a story in Wired magazine about Cloudera, actually, and it talks about Google's work and how it enabled the market.
But I've been following Google, actually, since the inception when the founders founded it, and watching the progress. And one thing about open source that I've noticed is that Larry and Sergey are geeks who love to build their own stuff. I mean, they love to tinker. They're always talking about things that are just in there. You know, Google came out of a Stanford research project originally. These guys were taking computers and gluing them together. They had a Lego storage system with their hard drives. And they built all of it. They build their own stuff. They hack. But they're now getting good at it. The Chrome browser is a great example. You've seen some of the work done in Android. Obviously, it's a success. But what's different is that you guys are already in big data. We interviewed Sqrrl, this hot startup that we like. We think it's one of the gems that are going to come out of the woodwork pretty soon. They came from NSA. Not NSA. Well, they had to build their own big data because they needed big data. So they had a specific need that they had to build on. And Google, too, with Bigtable and everything else. Talk about Bigtable and Google's vision of big data and what's going on in the market. What I think about Google internally is I never hear people talking about themselves as data scientists. Like, we're all just consumers of lots of data, right? Think of our servers. We have lots and lots of servers, and we need to know the status of those servers. Nobody goes around saying, I'm a Google data scientist. Oftentimes, they are just consumers of the services we have internally. One of them is Dremel, and the product I work on, BigQuery, is based on that. MapReduce came from that as well. So we like to have these things as services, so we don't have to think about them. They're just built for us. And I think that's going to be the future of the big data scene. Let's talk about Dremel.
Because Dremel is getting a lot of press. Explain Dremel to the geeks out there and the folks out there. Dremel is an ad hoc query system for massive amounts of data. The query results come back in seconds. But what's cool about it is it uses a SQL-style language, so it's easy to write these queries. Sometimes you have to write a MapReduce to do these things as well. And using a SQL-style language is expressive. It's easy to do. Anybody who knows SQL can use it. Another cool thing, it's not MapReduce. It's mostly in memory. We only read from disk once, so the query results come back really fast. We have a research paper about this as well. But what's great for me is I'm a developer. I like to use this stuff externally. So we've made this external. You can actually use it via an API. We have people building apps on it. We have people integrating it with their own apps. Is that a public service? Public API? Are there terms of service for that? Yeah, exactly, there are. You can go to developers.google.com/bigquery to check it out. It's a product you can use right now. And people are building stuff. What's the use case for that one? Oh, all kinds of things. You can do log analysis. People are using it for social media, social games, pulling in information about the games, pulling in terabytes of data and asking questions about them quickly. People are building dashboards. That's another common use case. All kinds of stuff. Ad metrics is a great one as well. Any time you have tons of data and you need to query it quickly, this is the right tool for you. Well, you have the data. I mean, who has got the data, right? So what's the conversation like internally? I mean, you've got to be looking at that saying, all right, what else can we do with this data? How can we add more value with this data? It's more like we have to rethink the way we do things. So here's an example.
If you have a relational database and you need to do a full table scan of that data, that'll give people heart attacks. Like, we often joke about a DBA who has to do a full table scan on a relational database. They're going to have a hard time. But at Google, we go, well, why don't we just do a table scan and then figure it out from there? Like, we kind of invert our thinking about it because we're forced to do that. And so Dremel came out of that process. You don't actually go around saying the words data science. They actually have a requirement in the special algorithm of hiring. If you can't add the proper way and do the differential equation, you're out. Oh, by the way, you didn't go to the right school, so you're bounced out. Nope, I can't comment on that. But we are problem solvers. We're faced with these big problems and we just have to deal with them. So that's what happens. I had a friend who used to work at Google. I won't say his name to protect the innocent. He used to get in there. He's like, I have no idea. He went to a school you wouldn't even want to know. And he says, I don't know how I got in. I must have slipped through the algorithm. We always have that fear that somehow we slipped through the process. It's diversity. Someone needs, you know. I've got to say, one of the best things about working there is that there are smart people everywhere. They're almost too smart. It's great. All right, so you're a smart person. Give us your macro view. I can tell already, you have a very high clock speed mentally. It's all the coffee. It's all the coffee. Multicore going on there. So give us the view of the conference from your perspective, because obviously, you're inside the ropes. But there's a lot of people out there. I mean, people are getting turned away at the door, there's not enough room, packed house. I mean, I was talking to Dave about this earlier.
And I think the thing about this conference this year is we're gluing together all the parts of big data. So we have all these tools that are specialized for a particular use case. BigQuery is great for really fast queries over data. Collecting data, that's HBase. That's really good for that. And we're putting together all the systems that are filling in the gaps. So now we have services coming out that are doing, as you were saying, relational and non-relational together. We have services for people building apps in the cloud, where you just put the whole big data app in the cloud. You don't have to do anything more. BigQuery is like that, too. It's a service you just use. You just sign up. It's an API. It's easy to use. So this is the year of filling in the gaps for the big data pipeline. Massive amounts of data are now easier to deal with, and people are finding solutions that are just easy to spin up. You were talking about the data in the cloud. Google Compute Engine, talk about that a little bit. How's that going? The hardest part has to be getting the data into the cloud. They won't let us in, by the way. They promised at Google I/O that we'd get in. Guys, look at my request. I'll ask the Compute guys about it. So I'm not actually on the Compute Engine team, but one thing about that is you're right about getting data into the cloud. That's a bottleneck for a lot of people. So everyone's looking into this problem. The fact is the internet is slow. I have a buddy, Ilya, who gave the GitHub talk here at Strata, and he's on the Make the Web Faster team at Google. There's actually a team for that. And he's always talking about how the bottleneck is that the internet's actually kind of slow, and that's the number one blocker. Well, you guys solved the speed of light problem with Spanner. Spanner is awesome. If you guys don't know what this is, we have a research paper about Spanner.
It's sort of our large-scale database system. You should definitely check it out. Google it. Google actually wants to make the web faster. I've got to give Google a lot of credit. They're very pro-web, and they want to make it faster. But there's also an economic incentive. The faster the page loads, the faster the clicks, and statistically, it adds up. When you look at the market share, a small percent improvement matters. I mean, Google is all about making things better for everyone. Yeah, I mean, that's maybe true, but also, we do want the web to go faster. We're web users, too, and we want to make that happen. So give us the update on code.google.com. I got a briefing a couple of years ago. You guys are obviously pro-open source. There's no hiding behind that. You guys are great. What's the update? What's new? People might not know what's going on with code.google.com right now. Code.google.com is great. I use it all the time. That's a great way to share data. We have all sorts of ways to interact with it. We have a lot of Git users here, and Mercurial users, and it supports all those as well. So we actually put all of our open source code for BigQuery up on that site as well. So you should definitely check it out. I mean, I think it's just a great way to go. What about developer support? What kind of support are you guys offering? All kinds of stuff. For example, for BigQuery, we like to interact on Stack Overflow. We know that's where the developers are. So our official developer support is on Stack Overflow. So if you go to the google-bigquery tag, that's where you'll find us. Engineers hang out there all day long, and we love to answer your questions. It's just a great way to meet with developers. We love it. Can you talk more about your developer program? Yeah, definitely. So we have a large number of people on the developer relations team at Google. And our job is to evangelize both outside of Google and internally as well.
So we want to make sure that developers are being heard, and we want to bring that feedback back into Google to make the products better. That's definitely what we try to do. We have tech writers as well, writing great documentation. My job is to go to conferences, talk to developers, write sample code, and see what the best practices are around our APIs. So what are you hearing from the community? What are they telling you? The big data community. Well, I'm a BigQuery guy, so I interact with big data people all the time. They love BigQuery. BigQuery is what I think of as the future of big data. It's easy to use. It's super powerful. It's easy to integrate. It uses conventions that web developers are used to. And so I just think it's the kind of model that's going to be found more in the future, these kinds of RESTful APIs. So I've got to ask this, because this is what's hot right now: Impala. Obviously, real-time querying. They're kind of dancing around. Some people are saying it's not there, not baked yet. Obviously, Jeff Hammerbacher, a great vision. It's built off MapReduce. You guys are living that dream right now with BigQuery and Bigtable. So what's your take on Impala? I like it. So here's what I feel. I guess it's an open-source project. I need to learn more about it. But I think it's great. I want to see more projects. If there's more competition for BigQuery, that's great. If there are more people looking into this need for real-time analytics or quick analytics on large data sets, that's great. So yeah, I look forward to seeing what they're doing. And I want to see more open-source projects around that. We love that. And I was just talking to Dave about this. Features that are found in other applications get kind of moved around. Like you see people with non-relational databases getting relational features and vice versa. And I think that can work as well for everything across the big data spectrum.
Well, we've seen the database evolution where it goes from general purpose and shifts to specialized, object store, OLAP, now big data. And it goes back to general purpose again. So we're kind of seeing this this year. I think we're just shaking out a new industry. And we're trying to figure out what works and what doesn't work, what's too expensive, and we're figuring that out this year. And I think in the upcoming years, you're going to see some models actually predominate when they get it all figured out. It reminds me a little bit of what Ethernet used to be. There used to be competing protocols. And we're kind of in that space where we're trying to figure out what works best, what's cheap enough, what people want to do. A lot of wealth was created during that generation. You've got 3Com, Cisco and the whole internetworking wave. TCP/IP enabled an entire shift of... Big data is full of buzzwords, but I think there's some validity in what's going on. I mean, people are trying to figure out what's going on, but I think we do have needs, and Google has shown that there's a need to process volumes of data. Let me ask you that question. The other thing was we lived through that era together, and Dave and I as well. So let's talk about that. TCP/IP enabled a whole bunch of stuff to happen. What's the big enabler? What's the disruptive enabler in this market today? If you could put your finger on it. So just speaking for myself, I think making this technology invisible is really what it's all about. Like not having to think about what you're doing with big data. Not having to build your own infrastructure, not having to worry about this stuff. But is there a technology? Can you put your hands on it and say that protocol or this element, that's the lever, that's enabling the disruption? I think it's really about usability. I think that's what's gonna make this stuff really, really useful.
I mean, I was just talking to some of my colleagues today about how we're still looking for success cases. Like, Google is a success case in the big data world, but I want to see other people build success cases. Yeah, exactly. I want to have more stories. And so I think the jury's still out about what the success cases are in this space. And I want to see more of that. But I think the technology's already there. It's just making it more usable, more accessible. Yeah, awesome. And we believe it too. We have our own big data project at SiliconANGLE and Wikibon. Our website at SiliconANGLE, for the Google folks who are watching or anyone else, doesn't have any ads. Wikibon's free content. We're free content. No banner ads. All open source. All open source content. And we monitor the crowd using big data and use predictive analytics to figure out the stories. And you can do that now. We are doing it. You're able to do that. Could you even do that five years ago? Maybe not as easily? It was hard with structured data, with a MySQL database. You might have been able to do it five years ago. Google, what a... The NSA and Google could do it, but not SiliconANGLE and Wikibon, I'm sure. But we're going to change that. That's what we're trying to do. The schema problem was a really big deal in the general-purpose developer market because you had to manage schemas. And it was hard. With large-scale data, it's a nightmare. Yeah, we want to make that easy and invisible and accessible. What do you think about MongoDB? Oh yeah, I use MongoDB. I think MongoDB is complementary with other technologies. It's great for having a non-relational database. It's a document store. When I want to take that data and analyze it, I stick it into BigQuery. So you believe that the vision of merging data sets is really where we're getting to? I mean, that's where the value is, right? That's a developer environment. Yeah, exactly.
So what's happening is startups are gluing this stuff together, right? Like duct-taping all these technologies together. And we're trying to figure out what works best. And there are going to be companies coming out that understand the best practices and just build that one service for everyone. And there are some guys here doing that. We believe in the same thing. We think you can go into a data market and figure out what's going on, but you've got to be able to integrate into it and use other data sets in real time. To me, I think that patchwork will create new solutions. Definitely. There's a lot of activity around that. And great. Okay, well, final question for you. Big prediction for the next couple of years. The usability, just the arrow forward. I think, so we were talking about cloud and how I think the internet is slow, but I think there's a lot of value in getting data into the cloud because there's a lot of opportunity there. There are a lot of great advantages to doing that. For example, interacting with data through a RESTful API. So I think there are going to be a lot more services like BigQuery in the cloud. There are going to be transformation services. There are going to be more storage services. And there are going to be more analytics services. I think that's where it's all going. So that's my prediction. Okay. Well, that's a wrap for this segment. It looks like day one's coming to an end. John Furrier and Dave Vellante will be right back with a wrap-up after the short break on theCUBE on siliconangle.tv. Great.