Hi, Alex Williams here of SiliconANGLE, at the Node Summit. We're once again covering events like no one else; we're the leaders in this space without a doubt. Anyway, I'm here with Paul Friedman of StreetLight Data. Hello. Paul, how are you doing today?

Excellent. It's a good day for startups.

Good. So let's start off with Node.js. Tell me how you're using it. And first of all, tell us a little bit about your company.

So StreetLight Data is a data-as-a-service company, and we're focused on information about where, when, why, and how people are moving throughout the real world. Compare that to the information companies have about what's going on on their website: who's visiting, what they're looking at, where their mouse is. There's just a wealth of information about what's going on in the online world. But in the real world, Starbucks could never tell you who's driving by in front of their store, or who's walking in versus not walking in on any given day.

So that's essentially the core of what you do?

That's exactly what we do. We sell information about where, when, and why people are traveling on real roadways in the real world. Who's driving past the store at different times of day? Where are cars breaking down? Where should services be provisioned? Where should businesses be located?

So how do you use Node.js?

We don't right now, actually.

And the front end, then?

On the front end, our front end is entirely written in JavaScript, and Ext JS is the stack we've built on. We're really leveraging a lot of good open-source geospatial technologies. GeoServer, from the OpenGeo folks, has just been doing yeoman's work for us. Talend, for doing bulk data movement and data integration, has been fantastic. And the core back ends we've been using are Postgres with the PostGIS extensions.
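The "who's driving by in front of the store" question Friedman describes is, at its core, a geospatial proximity query. A minimal sketch in Python, with hypothetical coordinates and function names (in practice this is the kind of thing PostGIS handles natively with distance queries):

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    r = 6_371_000  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def fixes_near(store, fixes, radius_m):
    """Return the GPS fixes that fall within radius_m of the store."""
    return [f for f in fixes
            if haversine_m(store[0], store[1], f[0], f[1]) <= radius_m]

# Hypothetical store location and GPS fixes as (lat, lon) pairs.
store = (37.7880, -122.4075)  # a downtown San Francisco corner
fixes = [
    (37.7881, -122.4074),  # roughly 14 m away: right outside the door
    (37.7750, -122.4183),  # well over a kilometer away
]
print(len(fixes_near(store, fixes, radius_m=100)))  # → 1
```

Aggregating such per-fix hits over days or weeks is what turns raw tracks into the "who's coming by that location on average" answer described later in the interview.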
And the big data stack from Cloudera for really handling the heavy lifting.

Okay, so what do you find unique about this event? I mean, you're talking about lots of data, and much of what we're hearing from the speakers here is about real-time data and uses for Node.js to help with that.

Yeah, and the real-time component is definitely a big one for us. We have a mobile component that streams back real-time geospatial tracks of where people are driving, as well as performance information. Are you taking that hard right turn? Are you driving like a madman, or like somebody's grandmother? That's streaming in in real time and being fed, right now, into an array of PostGIS geospatial databases, which are then processed periodically. We don't need to give Starbucks an up-to-the-instant count of who drove by; they really need to know who's coming by that location on average, so they can do a better job of site selection and planning. So we use bulk data movement via Talend to populate the big data stack of Hadoop.

Talend and Cloudera were two of our big data startups to watch this year.

Very much so. I think Gartner got it right on Talend. I actually did an evaluation of all the different open-source ETL tools. If I were going to pay for it, I'd go for Informatica, but in the world of lower-cost open source, Talend beats everybody else hands down.

Why?

The completeness and robustness of their technology, the extensibility of their technology, and the breadth of the adapter support they provide out of the box.

The connectors?

Exactly. And with their version 5, the depth of the Hadoop and Hive support on big data is really superior.

Well, good. So you use Talend for that bulk data?

Yes.

What is that bulk data?

The bulk data is the data after it's been geospatially processed and normalized out of those Postgres databases, those PostGIS databases.
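The "average, not instantaneous" point is the classic data-warehouse aggregation pattern. A small sketch of the shape of that query, using Python's built-in SQLite as a stand-in for Hive (the table and column names here are hypothetical, not StreetLight's actual schema):

```python
import sqlite3

# In-memory SQLite standing in for a Hive warehouse; schema is illustrative.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE trips (road TEXT, hour INTEGER, vehicles INTEGER)")
con.executemany(
    "INSERT INTO trips VALUES (?, ?, ?)",
    [
        ("Main St", 8, 120),   # day 1, 8 a.m. count
        ("Main St", 8, 140),   # day 2, 8 a.m. count
        ("Main St", 17, 300),  # evening rush
        ("Oak Ave", 8, 40),
    ],
)
# "Who's coming by that location on average?" -- an aggregate per road and hour,
# not a live count at any single instant.
rows = con.execute(
    "SELECT road, hour, AVG(vehicles) FROM trips "
    "GROUP BY road, hour ORDER BY road, hour"
).fetchall()
for road, hour, avg in rows:
    print(road, hour, avg)
```

The same GROUP BY/AVG shape runs essentially unchanged in HiveQL, which is one reason a SQL-like layer matters to them (a point Friedman returns to near the end of the interview).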
Inside there, we're taking all this noisy, dirty GPS data streaming off the mobile phones and cleaning it up to really see what road you're driving on and how fast you're driving. That's the bulk data we're pushing into a Hive data warehouse to do our larger analytics, so people can ask: okay, in this neighborhood, how safely are people driving? Or where are women going shopping at 2 in the afternoon, and on what roads are they driving?

So that's where the Hadoop integration comes into play?

Exactly. It's processing those millions upon millions of records of where and why people are driving. It's doing classic data-warehouse, big-data analytics.

But you're using Cloudera?

We're leaning towards Cloudera right now. We evaluated them against DataStax and are leaning towards Cloudera.

Why?

I'd say robustness, completeness of the solution, and I think they're just a little further along. I like the failure mode and the greater reliability of DataStax, but I think Cloudera has the edge, and with their relational integration via the Oracle partnership and others, I think it's better suited for us.

What other big data storage options are out there? There are more distributed storage options available now.

Yep. We evaluated the one being spun out of LexisNexis right now. They've got a large storage option.

Right, HPCC.

Exactly, HPCC. We looked at that. A little too proprietary, a little too closed of an environment, is what we saw.

But they've opened it up.

They have, but it's still early. We got one of the earlier evaluation copies of that last year. It's a question of whether it will gain traction. Will it catch on, and will the ecosystem grow for it? Whereas the generic Hadoop stack is definitely being embraced by the big data vendors. I mean, I'd hate to be sitting in Teradata's shoes today.

Well, I'd like to get to that. But there are distributed storage options like Red Hat Cluster.
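One concrete piece of the GPS-cleaning step Friedman mentions is rejecting fixes whose implied speed is physically impossible. A minimal sketch of that idea, with hypothetical coordinates and thresholds (real map-matching against a road network is considerably more involved):

```python
import math

def haversine_m(p, q):
    """Great-circle distance in meters between (lat, lon) points p and q."""
    r = 6_371_000
    phi1, phi2 = math.radians(p[0]), math.radians(q[0])
    dphi = math.radians(q[0] - p[0])
    dlmb = math.radians(q[1] - p[1])
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def drop_gps_outliers(track, max_speed_mps=60.0):
    """Keep fixes whose implied speed from the previous kept fix is plausible.

    track: list of (timestamp_s, lat, lon). A fix implying more than
    max_speed_mps (~216 km/h by default) is treated as GPS noise and dropped.
    """
    if not track:
        return []
    kept = [track[0]]
    for fix in track[1:]:
        t0, *p0 = kept[-1]
        t1, *p1 = fix
        dt = t1 - t0
        if dt <= 0:
            continue  # duplicate or out-of-order timestamp
        if haversine_m(p0, p1) / dt <= max_speed_mps:
            kept.append(fix)
    return kept

# Hypothetical one-fix-per-second track with one wild GPS jump in the middle.
track = [
    (0, 37.7880, -122.4075),
    (1, 37.7881, -122.4075),  # ~11 m in 1 s: plausible city driving
    (2, 37.8500, -122.4075),  # ~6.9 km in 1 s: impossible, dropped as noise
    (3, 37.7882, -122.4075),
]
clean = drop_gps_outliers(track)
print(len(clean))  # → 3
```

Once the outliers are gone, the surviving fixes can be snapped to road segments and the per-segment speeds fed into the warehouse-side aggregates.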
You could use Red Hat Cluster, and they're going to be integrating that into OpenShift as well.

Yep, yep. For us, we like Hive's pseudo-relational layer on top of it. We're not looking for a pure key-value type of store, which a lot of the stacks have been focused on, because we want to do real, classic data-warehouse, data-mart type queries where we're doing aggregated calculations on the data. There's also a lot more knowledge out there for SQL and SQL-like support, in terms of code generation, JDBC integration, and front-end connectivity. So we're trying to stay as relational-like as we can.

So you guys really explored this. We only have a few minutes left. Let's end on where you're going with the company, and how you're going to adapt to that geospatial element that really is part of the data you're collecting.

Oh, it's core to our business, because we need to know where people are, what road you're driving on, what the weather was at that point in time and at that point in space, and the other external factors. Not only is time an important dimension, but location is a critical dimension for us. And we've found that the open-source and academic community support for that is just fantastic, but it hasn't percolated its way up into the big data stacks yet. So we're still relying on the relational databases to do that geospatial processing.

That's tough.

Yeah. I mean, as soon as somebody ports some really good GIS support over to Hadoop and/or Hive, we'll be all over that.

There's an opportunity there. Well, thank you very much, Paul, for taking some time to talk with us.

No, thank you.

Paul Friedman of StreetLight Data, the Chief Technology Officer. I'm Alex Williams with SiliconANGLE. We'll be right back from the Node Summit.