From San Jose, California, it's theCUBE, covering Big Data Silicon Valley 2017.

Hey, welcome back, everybody. Jeff Frick here with theCUBE. We're wrapping up two days of wall-to-wall coverage at Big Data SV, which is associated with Strata + Hadoop World, which is part of Big Data Week — the week San Jose always becomes the epicenter of the big data world. We're at the historic Pagoda Lounge, and we're excited to have our next two guests, who bring a little different twist on big data than maybe you'd thought of. We've got Ravi Dharnikota, Chief Enterprise Architect at SnapLogic. Welcome. Hello. And he has brought along a customer: Katherine Matsumoto, a data scientist at eero. Welcome. Thank you. Thanks for having us. Absolutely.

So we had SnapLogic on a little earlier with Gaurav, but tell us a little bit about eero, for folks who aren't familiar with the company.

Yeah, so eero is a startup based in San Francisco. We're driven to improve home connectivity — both the performance and the ease of use — as Wi-Fi becomes part of everyday life. We created the world's first whole-home mesh Wi-Fi system. In an average home, that means three individual units: you plug one in to replace your router, and the others get plugged in throughout the home — just into power — and together they spread coverage, reliability, and speed throughout your home. No more buffering or dead zones in that way-back bedroom.

And it's a consumer product. Yes. So you've got all the fun challenges of manufacturing, distribution, consumer marketing — a lot of challenges for a startup. You guys are doing great. Why SnapLogic?

Yeah, so in addition to the challenges with the hardware, we're also very much a software company. Everything is set up via the app.
We're not just the backbone of your home's connectivity, but also part of it. We're sending a lot of information back from our devices so we can learn and improve the Wi-Fi we deliver based on the data we get back. So that's a lot of data, and a lot of different teams working on different pieces. When we were looking at launch, the question was: how do we integrate all of that information together to make it accessible to business users across different teams? And also, how do we handle the scale? I made a checklist, and SnapLogic was the only one that really seemed able to deliver on both of those promises, with a view to the future — I don't know what my next SaaS product is, I don't know what the next API endpoint we'll need to hit is. So the flexibility of that, as well as the fact that analysts were able to pick it up, engineers were able to pick it up, and I could still manage all of the pipelines written by each of those different groups without having to read whatever version of code each of them happens to write.

So Ravi, we heard you guys are doubling your customer base every year, with lots of big names — Adobe, which we talked about earlier today. But I don't know that most people would think of SnapLogic as a solution for a startup mesh-network company.

Yeah, absolutely — that's a great point. So let me just start off by saying that in this new world, we don't discriminate, we integrate. And this new world I speak of — I'll get to that — is social, mobile, analytics, and cloud. In this world, people run into something we fondly call the integrator's dilemma. You want to integrate apps, you go to one tool set; you want to integrate data, you start thinking about a different tool set. We want to dispel that and really provide a unified platform for both apps and data.
So we're seeing all the apps move into the cloud and be provided as services, but the data systems are also moving to the cloud: your data warehouses, databases, BI systems, analytical tools — all being provided to you as services. And in this world, data is data. If it's apps, it's probably schema mapping; if it's data systems, it's transformations moving from one end to the other. So we're here to solve both of those challenges in this new world with a unified platform. And it also helps that we have the lineage and the brain trust that brings us here — we did this a couple of decades ago, and we're here to reinvent that space.

We expect you to bring Clayton Christensen along next time you come to visit, because he needs a new book, and I think you two could do one. But I think a really interesting part of the story is that you have such a dynamic product. If you just looked at your boxes — I've got the website pulled up — you wouldn't necessarily think of the dynamic nature of it: you're constantly tweaking, taking the data from the boxes to change the service you're delivering. It's not just a thing you made to a spec and shipped out the door.

Yeah, and that's really where the automatic updates come in. We did 20 firmware updates last year, which is unheard of. The problem used to be that customers would have the same box for three years while the technology changed and the chips changed, but their Wi-Fi service stayed the same. We're constantly innovating and able to push those updates out. But if you're going to do that many updates, you need a lot of feedback on them, because sometimes things break when you update. We've been able to build systems that catch that — systems that identify changes no one person could spot by looking at their own devices, or just through support. We have leading indicators across all sorts of stability and performance metrics and different devices.
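The kind of leading-indicator check described here — comparing a device-health metric before and after a firmware rollout and flagging regressions — can be sketched in a few lines. Everything below (metric, sample values, threshold) is a hypothetical illustration, not eero's actual pipeline:

```python
# Hedged sketch of a rollout regression check: compare a fleet metric
# before and after a firmware update and flag drops beyond a threshold.
# All names and numbers are illustrative assumptions.
from statistics import mean

def flag_regression(before: list[float], after: list[float],
                    max_drop_pct: float = 5.0) -> bool:
    """Return True if the post-update metric dropped more than max_drop_pct."""
    baseline = mean(before)
    current = mean(after)
    drop_pct = (baseline - current) / baseline * 100
    return drop_pct > max_drop_pct

# Example: mean throughput (Mbps) sampled before and after an update.
before = [92.0, 95.0, 91.0, 94.0]
after = [80.0, 78.0, 82.0, 79.0]
print(flag_regression(before, after))  # a ~14% drop trips the 5% threshold
```

In practice such a check would run per metric and per device model, which is how a protocol change on one device family (the Xbox example that follows) gets isolated quickly.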
So if Xbox changes their protocols, we can identify that really quickly. And that's the goal of having all the data in one place, across customer support and manufacturing: we can easily pinpoint, among many complicated factors, where the problem is.

So I've actually got questions for both of you. Ravi, starting with you: it sounds like you're trying to tackle a challenge that in today's tools would include Kafka at the data-integration level, and there it's very much a hub-and-spoke approach. And application-level integration was more like TIBCO and the other EAI vendors of a previous generation, which I don't think was hub and spoke — it was more point to point. I'm curious how you resolve that — in other words, how you tackle both together in a unified architecture.

Yeah, that's an excellent question. In fact, it's part of the integrator's dilemma I spoke about. You've got the high-latency, high-volume problem set, where you go to ETL tools, and then the low-latency, low-volume set, where you immediately go to the TIBCOs of the world — the ESB, EAI sort of tool sets. What we've done is think about it hard, and at one level we've simply asked: why can't integration be offered as a service? That's step number one — the design experience is through the cloud, and then execution can happen anywhere: behind your firewall, in the cloud, or in a big data system. So it caters to all of that. But the data itself is also changing. You're seeing a lot of the document data model in what these SaaS services offer, while the old ETL companies, built before all of this social-mobile stuff came around, were all row- and column-oriented. So how do you deal with the more document-oriented, JSON sort of data?
And we built the platform to be able to handle that kind of data. Streaming is an interesting and important question — pretty much everyone I spoke to last year said streaming was big: "I want everything in real time." But batch also has its place. So you've got to have a system that does batch as well as real time, or near real time, as needed. We solve for all of those problems.

Okay. So Katherine, coming to you: every consumer has essentially a different install base, and you bring all the telemetry back to make sense of what's working, what's not working, or how their environment is changing. How do you make sense of all that, considering it's not B2B, it's B2C? I don't know how many customers you have, but it must be in the tens or hundreds of thousands. It's the distinctness of each customer, I gather, that makes the support challenge for you.

Yeah, and part of that is exposing as much information as possible to the different sources and starting to automate the ways in which we do it. We're certainly very early on as a company — we hit our one-year mark for public availability at the end of last month, so thank you. It's been a long year. But with that, we learn more constantly, and different people come to different views as new questions come up. On the special-snowflake aspect of each customer: there's a balance between how much actually is special and how much you can find patterns. And that's really where you get into much more interesting things on the statistics and machine learning side — how do you identify the patterns you may not even know you're looking for? We are still beginning to understand our customers from a qualitative standpoint.
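One simple version of "finding the patterns you didn't know you were looking for" is flagging a customer segment whose aggregate metric sits far from the rest of the fleet. This is an editorial sketch — the segments, metric, and threshold are all made up for illustration:

```python
# Hedged sketch of outlier-segment detection: flag segments whose metric
# is far from the other segments' mean (leave-one-out z-score).
# Segment names, the metric, and the threshold are hypothetical.
from statistics import mean, stdev

def odd_segments(metrics: dict[str, float], z_threshold: float = 3.0) -> list[str]:
    """Return segment names whose value is more than z_threshold standard
    deviations from the mean of the remaining segments."""
    flagged = []
    for name, value in metrics.items():
        others = [v for k, v in metrics.items() if k != name]
        mu, sigma = mean(others), stdev(others)
        if sigma > 0 and abs(value - mu) / sigma > z_threshold:
            flagged.append(name)
    return flagged

# Example: mean daily disconnects per device, grouped by made-up segment.
segments = {
    "apartment": 0.8, "suburban": 0.7, "rural": 0.9,
    "small_business": 4.5,  # this population "looks kind of weird"
}
print(odd_segments(segments))  # → ['small_business']
```

A flagged segment is exactly the kind of "weird population" that, in the anecdote that follows, gets handed to another team for a qualitative deep dive.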
So it actually came up this week: I was doing an analysis, and I thought, this population looks kind of weird. And with two clicks I was able to send a list over to our CX team. They have access to all the same systems, because all of our data is connected — through SnapLogic we're joining all the data together, and we use Looker as our BI tool. They were able to just start going into the tickets and doing a deep dive, and that's being presented later this week: hey, what is this population doing?

So for you to do this, that must mean you have at least some data that's common to every customer. For you to be able to use something like Looker, I imagine that if every customer were a distinct snowflake, it would be very hard to find patterns across them.

Well, look at how many people have iPhones, have MacBooks. We're looking at a lot of aggregate-level data in terms of how things are behaving. And the perpetual challenge of any data science project is creating those feature extractions. So the process we're going through as the analytics team is to start extracting those things and adding them to our central data source. That's one of the areas where having very integrated analytics and ETL has been helpful, because we're just feeding that information back to everyone. Once we figure out, say, how to differentiate small businesses from homes — because we do see a couple of small businesses using our product — that goes back into the data, and now everyone's consuming it. Each of those common features is slow to create, but every one you add to the central group increases the value.

One last question.
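The feature-extraction loop Katherine describes — derive a reusable feature from raw telemetry, then attach it to the shared dataset so every team can consume it — might look like this minimal sketch. The field names and the small-business heuristic are hypothetical, not eero's:

```python
# Hedged sketch of last-mile feature extraction: compute a derived flag
# from raw per-network records and enrich the central dataset with it.
# All field names and the heuristic itself are illustrative assumptions.

def is_small_business(network: dict) -> bool:
    """Toy heuristic: many devices plus a weekday-daytime traffic peak."""
    return network["device_count"] >= 15 and network["peak_hour"] in range(9, 18)

def enrich(networks: list[dict]) -> list[dict]:
    """Add the derived feature to every record in the shared dataset."""
    return [{**n, "is_small_business": is_small_business(n)} for n in networks]

raw = [
    {"network_id": "a1", "device_count": 6,  "peak_hour": 20},
    {"network_id": "b2", "device_count": 22, "peak_hour": 11},
]
print([n["is_small_business"] for n in enrich(raw)])  # → [False, True]
```

Once the enriched records land back in the central source, the feature is available to every downstream consumer without each team re-deriving it.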
Well, I was just going to say, it's an interesting way to think of the Wi-Fi service and the connected devices as an integration challenge, as opposed to just an appliance that kind of works like an old POTS line — which it clearly isn't at all, with 20 firmware updates a year. Yeah.

Yeah, and there's another interesting point — we were just having this discussion offline. It's a startup; they obviously don't have the resources or the appetite for a large IT department to set up these systems. So, as Katherine mentioned, it was a one-person team initially, and they need to be able to integrate who knows which system comes next. Maybe they experiment with one cloud service; it scales to their liking or it doesn't, and then they quickly change and go to another one. You cannot be changing the integration underneath every time — you've got to be able to adjust to that. So that flexibility matters. And the other thing is what they've done in making their business users self-sufficient, which is another very fascinating thing. Give them the power — why should IT, or that small team, become the bottleneck? Don't come to me; I'll just empower you with the right tool set and the patterns, and from there you change and put in your business logic and be productive immediately.

Let me drill down on that, because my understanding, at least in the old world, was that ETL was kind of brittle. Part of the genesis of Hadoop, certainly at Yahoo, was: we're going to bring all the data we might ever possibly need into the repository so we don't have to keep rewriting the pipeline. And it sounds like you have the capability to evolve the pipeline rather quickly as you want to bring more data into this sort of central resource. Am I getting that about right?

Yeah, it's a little bit of both.
So we do have that central repository — "data lake" is the fancy term for it. We're bringing everything into S3, dumping it in as raw JSON, whatever nested format it comes in, so that extraction is easy. But then, as part of ETL, there's the last mile, which is a lot of business logic. And that's where teams start to diverge very quickly if you don't have a way for them to give feedback into the process. So we've really focused on empowering business users to be self-service in answering their own questions. That's freed up our analysts to add more value back to the greater group, and to answer the harder questions — which beget more questions, but also feed insights back into that data source. And because they have access to their piece of that last-leg business logic, by changing the way one JSON field maps, or by combining two, they've suddenly created an entirely new variable that's accessible to everyone. So it's the last-leg business logic versus the full transport layer — we have a whole platform designed to transport everything and be much more robust to changes.

All right, let me make sure I understand this. It sounds like the less trained, or more self-sufficient, users go after the central repository, while the more highly trained and scarcer resources are responsible for owning one or more of the feeds — enriching them and making them more flexible and general-purpose so that the more self-sufficient users can get at them in the center.

Yeah, and you're also able to make use of the business context. We have a sort of hybrid model with our analysts: they're really closely embedded in the teams, so they have all the context you'd otherwise lose if you were relying on, say, a central IT team, where you'd have to go back and forth — why are you doing this?
What does this mean? They're able to build all of that into the logic. And the goal of our platform team is really to focus on building technologies that complement what we have with SnapLogic, or other tools connected to our data systems, to enable that same level of self-service — creating specific definitions, or doing it intelligently based on agreed-upon patterns of extraction.

Okay — heavy science. All right, well, unfortunately we're out of time. I really appreciate the story. After we sign off I'd love to check out the boxes, because I know I have a bunch of dead spots in my house. But Ravi, I want to give you the last word, really, about what it's like working with a small startup doing cool, innovative stuff. It's not your Adobes, not the huge enterprise clients that you have. What have you taken from it — why is it of value to SnapLogic to work with a cool, fun, small startup?

Yeah, look, the enterprise is always a retrofit job. You have to go back to the SAPs and the Oracle databases and make sure we're able to connect the legacy with the new cloud applications. Whereas with a startup, it's all new stuff, but their volumes are constantly changing. They have spikes, they have burst volumes, they're thinking about this differently — enabling everyone else, quickly changing and adopting newer technology. So we have to be able to match that agility along with them. We're very excited to be partnering with them and going along with them on this journey. And as they start looking at other things — the machine learning and AI and IoT space — we're very excited to have that partnership, learn from them, and evolve our platform as well.

Clearly you're smiling ear to ear, Katherine's excited, you're solving problems.
So thanks again for taking a few minutes, and good luck with your talk tomorrow. All right — I'm Jeff Frick, he's George Gilbert. You're watching theCUBE from Big Data SV. We'll be back after this short break. Thanks for watching.