Hi, Alex Williams here of SiliconANGLE, here at the Node Summit. We're once again covering events like no one else. We're the leaders in this space, without a doubt. Anyway, I'm here with Paul Friedman of StreetLight Data. Hello, Paul, how are you doing today?
Excellent. It's a good day for startups.
Good. So let's start off with Node, Node.js. Tell me how you're using it. And first of all, tell us a little bit about your company.
So StreetLight Data is a data-as-a-service company, and we're focused on information about where, when, why, and how people are moving throughout the real world. Compare the information that companies have about what's going on on their website: who's visiting, what they're looking at, where their mouse is. There's just a wealth of information about what's going on in the online world. But in the real world, Starbucks could never tell you who's driving by in front of their store, or who's walking in versus who's not walking in, on any given day.
So that's essentially the core of what you do?
That's exactly what we do. We sell information about where, when, and why people are going on the real roadways in the real world. Who's driving past the store at different times of day? Where are cars breaking down? Where should services be provisioned? Where should businesses be located?
So how do you use Node.js?
We don't right now, actually.
And the front end, then?
On the front end, our front end is entirely written in JavaScript. Ext JS is the stack that we've built on. We're really leveraging a lot of good open source geospatial technologies. GeoServer from the OpenGeo folks has just been doing yeoman's work for us. Talend, for doing bulk data movement and data integration, has been fantastic. And the core back ends that we've been using are Postgres with the PostGIS extensions and the big data stack from Cloudera for really handling the heavy lifting.
Okay, so what do you find unique about this event? I mean, you are talking about lots of data, and much of what we're hearing from the people speaking here is about real-time data and uses for Node.js to help with that.
Yeah, and the real-time component is definitely a big one for us. We have a mobile component that is streaming back real-time geospatial tracks of where people are driving, as well as performance information. Are you taking that hard right turn? Are you driving like a madman, or are you driving like somebody's grandmother? That's streaming in in real time and being fed into, right now, an array of PostGIS geospatial databases, which are then in sort of periodic... you know, we don't need to give Starbucks an up-to-date count of who drove by at that instant. They really need to know, on average, who's coming by that location so they can do a better job of site selection and planning. So we use bulk data movement via Talend to populate the big data stack of Hadoop.
Talend and Cloudera were two of our big data startups to watch this year.
Very much so. I think Gartner got it right on Talend. I actually did an evaluation of all the different open source ETL tools, and if I was going to pay for it I'd go for Informatica, but in the world of lower-cost open source, Talend beats everybody else hands down.
Why?
The completeness, the robustness of their technology, the extensibility of their technology, the breadth of the adapter support they provide out of the box.
The connectors?
The connectors, exactly. And with their version 5, the depth of the Hadoop and Hive support on big data is really superior.
Well, good. So you use Talend for that bulk data, yes? What is that bulk data?
So the bulk data is the data after it's been geospatially processed and normalized out of those Postgres databases, those PostGIS databases. Inside there, we're taking all this noisy, dirty GPS data streaming off the mobile phones and cleaning it up to really see what road you are driving on and how fast you are driving. That's the bulk data that we're pushing into a Hive data warehouse to do our larger analytics, to enable people to say, okay, in this neighborhood, how safely are people driving? Or where are women going shopping at two in the afternoon, and on what roads are they driving?
Okay, so that's where the Hadoop integration comes into play.
Exactly. Processing those millions upon millions of records of where and why people are driving is classic data warehouse, big data analytics.
What are you using, Cloudera?
We're leaning towards Cloudera right now. We evaluated them against DataStax and are leaning towards Cloudera.
Why?
I'd say robustness, completeness of the solution, and I think they're just a little further along. I like the failure support, the failure mode and greater reliability of DataStax, but I think Cloudera has got the edge, and with their relational integration via the Oracle partnership and others, I think it's better suited.
What about other big data storage options out there? I mean, there are more distributed storage options available now.
Yep. We evaluated the one that's being spun out of LexisNexis right now. They've got a large storage option.
Right, HPCC.
Exactly, HPCC. We looked at that. A little too proprietary, a little too closed of an environment, is what we saw.
You know, but they've opened it up.
They have, but it's still early. We got one of the earlier evaluation copies of that last year. It's a question of, will that gain traction? Will it catch on, and will the ecosystem grow for that? Whereas the generic Hadoop stack is definitely being embraced by the big data vendors. I mean, I'd hate to be sitting in Teradata's shoes today.
Well, I'd like to get to that, but there are other distributed storage options, like Red Hat Gluster. You can use Red Hat Gluster, and they're going to be integrating that into OpenShift as well.
Yep, yep. For us, we really like the Hive pseudo-relational layer on top of it. We're not looking for a pure tag-value type of store, which a lot of the stacks have been focused on, because we want to do real classic data warehouse, data mart type queries where we're doing aggregated calculations on the data. There's also a lot more knowledge out there for SQL and SQL-like support, in terms of code generation, in terms of JDBC integration and front-end connectivity, so we're trying to stay as relational-like as we can.
So you guys really explored this. We only have a few minutes left. Let's end it on where you guys are going with the company and how you're going to really adapt to that geospatial element that really is part of the data that you're collecting.
Oh, it's core to our business, because we need to know where people are, what road you're driving on, what the weather was at that point in time and at that point in space, and the other external factors. So not only is time an important dimension, but location is a critical dimension for us. We've found that the open source and academic community support for that is just fantastic, but it hasn't percolated its way up into the big data stacks yet, so we're still relying on the relational databases to do that geospatial processing. I mean, as soon as somebody ports over to Hadoop and Hive some really good GIS support, we'll be all over that. There's an opportunity there.
Well, thank you very much, Paul, for taking some time to talk with us.
No, thank you.
Paul Friedman of StreetLight Data, the chief technology officer. I'm Alex Williams with SiliconANGLE. We'll be right back from the Node Summit.