Live from Las Vegas, it's theCUBE! Covering AWS re:Invent 2019. Brought to you by Amazon Web Services and Intel, along with its ecosystem partners. Hey, welcome back to theCUBE, Lisa Martin, at AWS re:Invent 19. This is day three of theCUBE's coverage. We have two sets here, lots of CUBE content. I'm joined by Justin Warren, the founder and chief analyst at PivotNine. Justin, how's it going? Great, great. You still have a voice after three days. Just barely, I've been trying to take care of it. I'm impressed, and you've probably talked to at least half of the 65,000 attendees, yeah? I'm trying to talk to as many as I can, yeah. Well, we're going to talk to another guy here: joining us from Dataiku is Will Nowak, Solutions Architect. Welcome to theCUBE. Thanks for having me. You have a good voice too after three days at re:Invent. Yeah, I've been doing the best I can, drinking a lot of tea. Yeah, tea is good, tea is good. So Dataiku, interesting name. Let's start off by sharing with our audience who Dataiku is and what you guys do in technology. Yeah, so the etymology of Dataiku: it's like haikus for data. So we say we take your data and we make poetry out of it. We make your data so beautiful. I like that. Wow. Now, for those who are unaware, Dataiku is an enterprise data science platform. So we provide a collaborative environment for, as we say, coders and clickers, kind of business analysts and native data scientists, to make use of an organization's data, build reports, and build predictive machine-learning-based models and deploy them. How long have you guys been around? Been around for eight years. Eight years, okay. So a startup still. Born in the cloud. Yeah, for sure. You saw the opportunity there, that data is no longer a liability, it's an asset, or should be. Yeah, so we've been server based from the start, which is one of our differentiators. And so by that we see ourselves as a collaborative platform.
Users access it through a web browser, log into a shared space, can share code, can share visual recipes, as we call them, to prepare data. Okay, so what are customers using the platform to do? Machine learning is pretty hot at the moment. I think it might be nearing the peak of the hype cycle. It's pretty hot. Yeah, so what are customers actually actively doing on the platform? Yeah, so we really focus on enabling the enterprise. So for example, GE has been a customer for some time now. And GE is a great prototypical example in that they have many disparate use cases: simple things, like doing customer segmentation for marketing campaigns, but also stuff like IoT predictive maintenance. So the use cases kind of run the gamut. And so with Dataiku, based on open source, we're enabling all of GE's users to come into a centralized platform, access their data and manipulate it for whatever the purposes may be. You talked about marketing campaigns for a second. I'm wondering, is there integration with CRM technologies? How would a customer wanting to understand customer segmentation, or how to segment for a marketing campaign, work with the CRM and Dataiku together, for example? It's a great question. So again, us being a platform, we sit on a single server, something like an Amazon EC2 instance, and then we make connections into an organization's data sources. So if you're using something like Salesforce, we can seamlessly pull in data from Salesforce, and you can manipulate it in Dataiku. But at the same time, maybe I also have some Excel file someone emailed me; I can bring that into my Dataiku work environment. And I also have a Redshift data table. All those things would come into the same environment. I can visualize, I can analyze and I can prepare the data. I see. So you say it's based on open source. I'm a longtime fan of open source, been involved in it for longer than I can remember, actually.
That's an interesting way to build your product. So maybe talk us through how you came to found the company and base it on open source. What led to that choice? What was that decision based on? Yeah, for sure. So you talked about the hype cycle: AI is hot, but how hot is AI? And so I think, again, our founders astutely recognized that this is a very fast-moving place to be. And so betting on one particular technology can be risky. Instead, by being a platform, we say, like, SQL has been the data transformation language du jour for many days now. So of course, you can easily write SQL, and a lot of our visual data transformations are based on the SQL language. But also something like Python: again, it's the language du jour for machine learning model building right now. So you can easily code in Python and maintain your Python libraries in Dataiku. And so by leveraging open source, we figured we're making our clients more future-proof. So long as they're staying in Dataiku, but using Dataiku to leverage the best in breed in open source, they'll always be where they want to be in the technological landscape, as opposed to locked into some tech that is now out of date. What's been the appetite for making data beautiful for a legacy enterprise like a GE that's been around for a very long time, versus a more modern one, either born in the cloud or, as our CEO says, reborn in the cloud? What are some of the differences, but also similarities, that you see in terms of "we have to be able to use emerging tech, otherwise someone's going to come in behind us and replace us"? Yeah, I mean, I think it's complicated, in that there's still a lot of value to be had in something like a bar chart you can rely on. Right, so it's maybe not sexy, but having good reporting and analytics is something that both, you know, 200-year-old enterprise organizations and data-native organizations and startups need.
At the same time, building predictive machine learning models and deploying those as REST API endpoints that developers in your organization can use to provide a data-driven product for your consumers, that's a more advanced use case that everyone kind of wants to be a part of. And again, Dataiku is a nice tool, which says, you know, maybe you don't have developers who are very fluent in turning out Flask applications. We can give you a place to build a predictive model and deploy that predictive model, saving you the time of writing all that code on the backend. One of the themes of the show has been transformation. So it sounds like Dataiku is something that you can dip your toes into and start to get used to using, even if you're not particularly familiar with, say, machine learning model building. Yeah, that's exactly right. So a big part of our product is enablement, and I encourage watchers to go try it out themselves: go to our website and download a free version, a free trial. So, you know, if you're the most sophisticated applied math PhD there is, Dataiku is a great environment for you to code in and build predictive models. If you've never built a machine learning model before, you can use Dataiku to run visual machine learning recipes, as we call them, and we also give you documentation which says, hey, this is a random forest model. What is a random forest model? We'll tell you a little bit about it. And that's been another thing that some of these enterprises have really appreciated about Dataiku: it's helping upskill their user base. In terms of that transformation theme that Justin just mentioned, which we're hearing a lot about, not just at this show, where it's a big theme, but all the time, right? In terms of a customer's transformation journey, whatever you want to call it, cloud is going to be an essential enabler of being able to really leverage value from AI.
So, I'm just wondering, from a strategic positioning standpoint, is Dataiku positioned as a facilitator or as fuel for a cloud transformation that an enterprise would undergo? Again, yes, great point. So for us, and I can't take the credit, this credit goes to our founders, we've thought, you know, from the start, that the cloud's an exciting proposition. Not everyone is there yet in 2019. Most people, if not all of them, want to get there. Also, many of our clients want to be multi-cloud. And so Dataiku says: if you want to be on-prem, if you want to be in a single cloud subscription, if you want to be multi-cloud, again, as a platform, we're just going to give you a connection to your underlying infrastructure. You can use the infrastructure that you like and just use our front end to help your analysts get the value they can out of the data. I think a lot of vendors across the entire ecosystem are understanding that customer choice is really important, and that customers, particularly enterprise customers, want to be able to have lots of different options. And not all of them will be ready to go completely all in on cloud today. It may take them years, possibly decades, to get there. So having that choice, it sounds like it's something that will work with you today and will work with you tomorrow, depending on what choices you make. Yeah, that's exactly right. And another thing we've seen a lot of, too, that I think Dataiku helps with: whether it's Dataiku or other tools, of course you want best in breed, but, particularly for a large enterprise, you also don't want people operating in a wild west, particularly in the ML and data science space. So we integrate with Jupyter Notebooks, but some of our clients who come to us initially just have, I won't say rogue, because that has a negative connotation, but maybe I will say rogue.
Rogue data scientists are just tapping into some data store and using Jupyter Notebooks to build a predictive model, but then to actually productionalize that and get sustainable value out of it, it's too one-off. And so having a centralized platform like Dataiku, where you can say, hey, this is what we're going to use as our central model repository, that's something where businesses can sleep easier at night, because they know: where is my ML development happening? It's happening in one ecosystem. What tools is it happening with? Well, the best in breed of open source. So again, you kind of get the best of both worlds with a tool like Dataiku. Yeah, so it sounds like the operations of machine learning are really, really important, rather than just the pure technology. Yes, that's important as well, and you need to have the data scientists to build it, but you also need something that allows you to operationalize it, so that you can just bake it into what you do every day as a business. Yeah, I think at a conference like this, all about tech, it's easy to forget what we firmly believe, which is that AI, and maybe tech more broadly, is still about human problems at the core, right? Once you get the tech right, the code runs correctly if the code is written correctly. But things like human interactions, project management, model deployment in an organization, these are really hard human-centered problems. So having tech that enables that human-centered collaboration helps with that, we find. Let's talk about some of the things that we can't ever go to an event and not talk about, and that is, with respect to data: quality, reliability and security. How does Dataiku facilitate those three cornerstones? Yeah, for sure. So again, I would encourage viewers to check it out. Dataiku has some nice visual indications of data quality.
So an analyst or a data scientist can come in and very easily understand: does this data's quality conform to the standards that my organization has set? And what I mean by standards, that can be configured, right? So does this column have the appropriate schema? Does it have the appropriate cardinality? These are things that an individual might decide to use. And then for security, Dataiku has its own security mechanisms. However, to this point about incorporating best-in-breed tech, we will also work with whatever underlying security mechanisms organizations have in place. So for instance, if you're using AWS and you have IAM roles to manage your security, Dataiku can port those and kind of apply those to the Dataiku environment. Or if you're using something like on-premise Hadoop, we can use something like Kerberos as a technology to, again, manage access to resources. So we're taking the best in breed that the organization has already invested time, energy and resources into and saying, we're not trying to compete with those, but rather we're trying to enable organizations to use these technologies efficiently. Yeah, I like that consistency of customer choice. We spoke about that just before. And I'm seeing that here with the choices around, well, if you're on this particular platform, we'll integrate with whatever the tools are there. People underestimate how important that is for enterprises, that it has to be a heterogeneous environment. Playing well with others is actually quite important. Yeah, to that point, the combination of heterogeneity but also uniformity is a hard balance to strike, but I think it's really important. Giving someone a unified environment, but still choice at the same time, it's like a good restaurant or something. You want to be able to pick your dish, but you want to know that the overall quality is high. And so having that consistent ecosystem, I think, really helps.
What are, in your opinion, some of the next industries that you see as really ripe to start leveraging machine learning to transform? You mentioned GE, a very old legacy business. If we think of what happened with the ride-hailing industry, Uber for example, or fitness with Peloton, or Pinterest with visual search, what do you think is the next industry where taking advantage of machine learning will completely transform it and our lives? Yeah, I mean, the easy answer that I'll give, because it's easy to say it's going to transform but hard to operationalize, is healthcare, right? So there is structured data, but the data quality is so disparate and heterogeneous. So I think, if organizations allow this, again it's a human-centered problem, right? If people can decide on data standards, and also data privacy is of course a huge issue. We talked about data security internally, but also, as a customer, what data do I want this hospital or this healthcare provider to have access to? Those are human issues we have to resolve. But conditional on that being resolved, and us figuring out a way to anonymize data and respect data privacy but have a consistent data structure, then we get to, hey, let's really set these AI and ML models loose and figure out things like personalized medicine, which we're starting to get to, but I feel like there's still a long way to go. It sounds like it's an exciting time to be in machine learning. People should definitely check out products such as Dataiku and see what happens. Last question for you: so much news has come out in the last three days. It's mind-boggling. What are some of your takeaways from the things that you've heard, from Andy Jassy to Werner this morning? Yeah, I think a big thing for me, which was something for me before this week, but it's always nice to hear Amazon reinforce it, is the concept of white-box AI.
We've been talking about that at Dataiku for some time. Everyone wants performant AI or ML solutions, but increasingly there's a real appetite publicly for interpretability. And so you have to be responsible, you have to have interpretable AI, and it's nice to hear a leader like Amazon echo that. At Dataiku, that's something we've been talking about since our start. A little bit validating then for Dataiku? For sure, yeah, for sure. Nice. Well, Will, thank you for joining Justin and me on theCUBE this afternoon. We appreciate it. Appreciate it. All right, for my co-host, Justin Warren, I'm Lisa Martin, and you're watching theCUBE from Vegas. It's AWS re:Invent 19.