George Gilbert from Wikibon. We're on the ground at Spark Summit 2015, with Justin Langseth, CEO and founder of Zoomdata. So tell us a little bit about where the inspiration for Zoomdata came from, and its unique value add.

Yeah, the inspiration came from looking at this big wave of big data technology that was starting to evolve, Hadoop and NoSQL and all that, on the back end. I realized that for the first time in 40 years, a SQL database wasn't the obvious place to put your data anymore. And then on the front end we saw Apple with the iPad and touch interfaces, and realized that the desktop computer wasn't necessarily the future interface for humans to interact with data. So we realized that both the back end and the front end of BI were being disrupted simultaneously, and it would be a great opportunity to start a fresh company with a blank sheet of paper architecturally and figure out the best way to build for humans to interact with all this big data.

Okay, so that's a great way of telling us what the catalyst was. Now tell us a little bit more. You're appealing to the business user, maybe starting from the business analyst and going further out into the business community, but that data has to come from somewhere. So where are you getting it from, and who are those users collaborating with?

Yeah, the data is in all kinds of places. Most enterprises have data in legacy databases, in flat files, in S3 or the Amazon cloud, in Hadoop, in MongoDB or all these NoSQL places, and then they have data in Salesforce.com or Google Analytics.
Data is just spread all over the place right now, and traditionally you'd try to ETL it all into a single data warehouse. But what people are realizing is that's becoming impossible, so we've built a technology, based on Spark, that can natively connect through to all those underlying repositories, pull the data into Spark only as needed, and push as much of the work down to the underlying data storage as possible. Spark then acts as a layer that smooths it all out, so the data can be joined or fused together and presented to the end users.

So Spark in this particular case sounds like a high-performance execution engine. And in the case of Spark running in the cloud, say the Databricks Cloud, there's some extra value add, because with traditional business intelligence tools there's a whole lot of work by other roles to make things easy and pretty. So how do you leverage the Databricks Cloud, or other Spark clouds, to do that?

So traditionally, in the BI, business intelligence, world, we worked with data modelers and ETL people who would build the data models and the ETL and set up the BI tools, so there were a lot of steps and usually at least a year of work before an end user could actually get a dashboard.
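The query pattern described here, pushing filters down to each underlying store and pulling only the matching rows into a common layer where they can be fused, can be sketched in plain Python. This is a minimal illustration of the idea only, not Zoomdata's actual connector API; all class and field names are made up:

```python
# Illustrative sketch of federated querying with predicate pushdown.
# Each source evaluates the filter itself, so only matching rows
# cross the wire into the federation layer.

class Source:
    """A data source that can evaluate simple filters locally."""
    def __init__(self, name, rows):
        self.name = name
        self.rows = rows

    def query(self, predicate):
        # The filter runs *inside* the source (pushdown), not in
        # the federation layer.
        return [row for row in self.rows if predicate(row)]

def federated_query(sources, predicate):
    """Push the same predicate to every source, then fuse the results."""
    fused = []
    for src in sources:
        fused.extend(src.query(predicate))
    return fused

# Hypothetical repositories holding slices of the same sales data.
warehouse = Source("legacy_db", [{"region": "east", "sales": 100},
                                 {"region": "west", "sales": 80}])
s3_bucket = Source("s3", [{"region": "east", "sales": 40}])

east = federated_query([warehouse, s3_bucket],
                       lambda row: row["region"] == "east")
total_east_sales = sum(row["sales"] for row in east)  # 140
```

In a real Spark deployment the same effect comes from source connectors that translate Spark filters into native queries, so the "smoothing layer" only ever holds the rows it actually needs.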
So we're seeing that timescale shrink really dramatically. What's interesting with Databricks, and the notebook support they have in Databricks Cloud, is that data scientists, or more advanced, code-savvy data analysts, can use those notebooks to build up data pipelines: what you'd traditionally think of as data wrangling or ETL, or machine learning, data enrichment, stats, all that kind of stuff. They can do that in the notebooks, and we've integrated directly with Databricks Cloud, so Zoomdata spins up in a little Docker container within Databricks Cloud, and we've hooked up the metadata so that anything a more advanced user has created in Databricks Cloud, a data set or an advanced machine learning algorithm, is automatically pushed into the Zoomdata metadata. The less technical, more businessy people can just use Zoomdata, but they get native, pre-set-up access to all those great artifacts and systems that the more advanced people have set up. And as long as the business users trust the person who set up a model, even if they don't necessarily understand how that model works, they can leverage it in their business analysis.
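The metadata handoff described here, where artifacts created in a notebook automatically surface in the BI tool's catalog without copying any data, might look roughly like the following. The catalog structures and function names are hypothetical, invented for illustration:

```python
# Hypothetical sketch: a notebook registers a cleaned dataset, and a
# metadata sync makes it visible to business users in the BI catalog.

notebook_catalog = {}   # artifacts the data scientist produces
bi_catalog = {}         # what the business user sees in the BI tool

def register_dataset(name, fields):
    """Called from the notebook once wrangling/ETL is done."""
    notebook_catalog[name] = {"fields": fields}

def sync_metadata():
    """Push notebook metadata into the BI catalog. Note that no data
    is copied: the BI tool records only the table/field definitions
    and queries the source live when a chart is rendered."""
    for name, meta in notebook_catalog.items():
        bi_catalog[name] = {"fields": meta["fields"],
                            "source": "databricks_cloud"}

# The data scientist finishes a pipeline and registers its output.
register_dataset("clean_sales", ["region", "sales", "date"])
sync_metadata()
```

The key design point from the interview is the last comment: the sync moves definitions, not rows, so queries still execute against the live source.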
Okay, that sounds clearly very powerful, but give us a concrete example. You know, you might be a retailer with point-of-sale data, you might have your inventory data. When you talk about those artifacts and the metadata, what is it that gets passed from the notebook, from the data science guy or the data modeling guy, and how does that then populate and simplify the work of the Zoomdata user?

Yeah, it's a couple of things. First, the table definitions themselves, the tables and fields. That data may have come into Spark or Databricks Cloud from any number of places and then been wrangled or transformed in those notebooks, so the end result is clean, wrangled data in tables. That metadata is pushed into Zoomdata, and Zoomdata doesn't make a copy of the data; it reaches into Databricks Cloud to actually execute the queries. Then also machine learning algorithms. If I build a machine learning algorithm, maybe it's a fraud detection algorithm, and I test it as a data scientist, I can then deploy it through Zoomdata directly, so that less technical people, who don't necessarily understand how the fraud is actually being detected, can still fraud-score transactions, either as they're being looked at historically or even in real time through Spark Streaming.

So in real time, would Spark Streaming be feeding a Zoomdata dashboard?

Yeah. So Kafka or Spark Streaming or Kinesis, we support lots of different real-time stream engines. Either way, Zoomdata can accept these streams directly and do continuous processing through Zoomdata, which is based on Spark Streaming, or you can stream the data directly into a fast data store like MemSQL, something that's really, really fast, and have Zoomdata continually tail that data store and simulate real-time visualization against data that's being loaded into something else, without even
passing through Zoomdata directly.

And so how has your uptake been? You were featured at a Spark meetup, which is quite high praise. What's the awareness, and how are customers gravitating to it now?

Yeah, we're really seeing a lot of people interested in Zoomdata, especially to embed it in other applications, because a lot of people are building data-driven applications or data-driven services, or they just have another application that needs reporting or dashboarding or charts in it, and they don't want to use legacy 10- or 20-year-old technology if they're building a new system that's going to last for the next 10 or 20 years. So they really like the way we've architected the system: the way it's built on Spark and handles real time, and on the front end, how we've built a really rich JavaScript SDK so our visuals can be embedded in other modern JavaScript applications, without iframes, without any Flash or that kind of old stuff. So we're really getting a lot of uptake from people who want to white-label Zoomdata or embed it into other applications, but leverage this whole stack of stuff that's under the covers.

Okay. Justin Langseth, CEO and founder of Zoomdata. This is George Gilbert, on the ground at Spark Summit 2015.