From San Jose, it's theCUBE, presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners.

Hey, welcome back to theCUBE's continuing coverage of our event, Big Data SV. I'm Lisa Martin with my co-host, George Gilbert. We are down the street from the Strata Data Conference. This is our second day, and we've been talking all things big data, cloud, and data science. We're now excited to be joined by the CEO of DataScience.com, Ian Swanson. Ian, welcome to theCUBE.

Thanks so much for having me. It's been an awesome two days so far, and it's great to wrap up my trip here on the show.

Yeah, so tell us a little bit about your company. DataScience.com, what do you guys do? What are some of the key opportunities for you in the enterprise market?

Yeah, absolutely. So my company is called DataScience.com, and what we do is offer an enterprise data science platform where data scientists get to use all the tools they love, in all the languages, with all the libraries, leveraging everything that is open source, to build models and put models in production. And then we also give IT the ability to manage this massive stack of tools that data scientists require. It all boils down to one thing: companies need to use the data they've been storing for years. It's about how you put that data into action, and we give data scientists the tools to get that data into action.

So let's drill down on that a bit. You know, for a while we thought if we just put all our data in this schema-on-read repository, that would be nirvana, but it wasn't all that transparent, and we recognized we have to go in and structure it somewhat. Help us take the next couple of steps from these partially curated data sets to something that turns into a model that is actionable.

That's actually been the theme of the show here at the Strata Data Conference. Years ago it was how do we store data; then it was how do we not just store and manage it, but transform it and get it into a shape we can actually use. The theme this year is how do we get to that next step, the step of putting it into action. So to layer on that: data scientists need to access data, yes, but then they need to be able to collaborate, work together, and apply many different techniques, machine learning, AI, deep learning, all techniques data scientists use to build a model. But then there's that next step: hey, I built this model, how do I actually get it into production? How does it actually get used? And here's the shocking thing. I was at an event with 500 data scientists in the audience, and I said, stand up if you've worked on a model for more than nine months and it never went into production. Ninety percent of the audience stood up. That's the last mile we're all still working on, and what's exciting is we can make it possible today.

So, it sounds like there's a lot of choice in the tools, but typically to do a pipeline, you either need well-established APIs that everyone understands and plugs together with, or you need an end-to-end, single-vendor solution that becomes a sort of collaboration backbone. How are you organized? How are you built?

Now this might be self-serving, but at DataScience.com we have an enterprise data science platform, so we recommend a unified platform for data science.
Now that unified platform needs to be highly configurable. You need to make it so that in that workbench, you can use any tool you want. Some data scientists might want to use a hammer; others want to use a screwdriver over here. And so the power is in how configurable it is, how extensible it is, how open it is, so you can adopt everything in open source. The amazing trend we've seen is from the proprietary solutions going back decades to the rise of open source today. Dozens if not hundreds of new machine learning libraries are being released every single day. We've got to give those capabilities to data scientists and make them scale.

Okay, so I think it's pretty easy to see how you would incorporate new machine learning libraries into a pipeline. But then there are also the tools for data preparation and for things like feature extraction and feature engineering. And you might even have some tools that help you figure out which algorithm to select. What holds all that together?

So orchestrating the enterprise data science stack is the hardest challenge right now. There has to be a company like us that is the glue, and that's not just about whether these solutions work together, but also how do people collaborate? What is that workflow? What are the steps in that process? And there's one thing you might have left out, and that is model deployment, model interpretation, model management. That's where this whole thing is going next, and that was the exciting theme I heard in all these discussions with business leaders over the last two days: model deployment, model management.

So if I can shift the conversation a little bit to the target audience. We've talked a lot about data scientists and needing to enable them. I'm curious, we just talked, a couple of guests ago, about the chief data officer. How do you work with enterprises? How common is the chief data officer role today, and what are some of the challenges they've got that DataScience.com can help them eliminate?

Yep. So the CIO and the chief data officer: we have CIOs that have been selecting tools for companies to use, and now the chief data officer is sitting down with the CEO and saying, how do we actually drive business results? And so we work very closely with both of those personas. But on the CDO side, it's really helping them educate their teams on the possibilities of what could be realized with the data at hand, and making sure that IT is enabling the data scientists with the right tools. So we supply the tools, but we also like to go in there with our customers and help coach, help educate on what is possible. That helps with the CDO's mission.

A question along that front. We've been talking about empowering the data scientist from one end of the modeling life cycle all the way out to deployment, which is currently the hardest part and least well supported. But we also have tons of companies that don't have data-science-trained people, or who are only modestly familiar. What do we do with them? How do we get those companies into the mainstream in terms of deploying this?

So I think whether you're a small company or a big company, digital transformation is the mandate. And digital transformation is not just how do I make a taxi company become Uber, or how do I make a speaker company become Sonos with a smart speaker?
It's how do I exploit all the sources of data I have to get better, improved operational processes, new business models, increased revenue, reduced operating costs. And so you can start small. We work with plenty of smaller companies. They'll hire a couple of data scientists and they're able to get small, quick wins. You don't have to go sit in the basement for a year building the one unicorn thing for the business; it's small, quick wins. Now, my company, we believe in writing code, in trained, educated data scientists. There are solutions out there where you throw data at them, you push a button, and you get an output. It's this magic black box, and there's risk in that. Model interpretation, what are the features it's scoring on, there's risk. But those companies are seeing some level of success. We firmly believe, though, in hiring a data science team that is trained. You can start small, two or three people, and get some very quick wins.

I was going to say those quick wins are essential, like digital transformation is essential, but it's also, I mean, essential to survival at a minimum, right? So those quick wins are presumably transformative for an enterprise being able to sustain itself and then eventually, or ideally, take market share from its competition.

Yes, and that is key for the CDO. The CDO is there pitching what is possible. She's pitching the dream. And so in order to help visualize what that dream and the outcome could be, we always say start small, get quick wins, and from there you can build. What you don't want to do is go nine months working on something when you don't know if there's going to be an outcome. A lot of data science is trial and error. This is science; we're testing hypotheses, so there's not always an outcome there. So small, quick wins is something we highly recommend.

So, a question. One of the things we see more and more is the idea that actionable insights are perishable and that latency matters. In fact, you almost have a budget for latency: in that short amount of time, the more features you can dynamically feed into a model to get a score, the better. Are you seeing more of that? In the use cases that you're seeing, how's that pattern unfolding?

So we're seeing more streaming data use cases, and we work with some of the biggest technology companies in the world, so IoT, connected services, streaming, real-time decisions that are happening. But then there are also so many use cases around the organization that could be marketing, finance, or HR related, not just tech related. On the marketing side, imagine if you're in customer service and somebody calls you and you know instantly the lifetime value of that customer, and it kicks off a totally new talk track. Maybe the call gets escalated immediately to a supervisor, because that supervisor can handle this top-tier customer. These are decisions that can happen in real time, leveraging machine learning models, and these are things that, again, are small, quick wins with massive, massive impact. And so it's about the decision process now, and that's digital transformation.

Okay, are you seeing patterns in terms of how much horsepower customers are budgeting for the training process, for creating the model? Because we know it's very compute intensive; some people, like even Intel, call it high-performance compute, a supercomputer-type workload. How much should people be budgeting? Because we don't see any guidelines or rules of thumb for this.
I still think the boundaries are being worked out. There's a lot of great work that NVIDIA is doing with GPUs; we're able to do things faster with that compute power. But even if we just start from the basics: go talk to a data scientist at a massive company with a team of over a thousand data scientists and ask, to do this analysis, how do you spin up your compute power? Well, I walk over to IT, knock on the door, and say, set up this machine, set up this cluster. That's ridiculous. A product like ours is able to instantly give them the compute power, scale it elastically with our cloud service partners, or work with on-prem solutions, so they get the power they need to get results in the time that's needed. Quick, fast. But in terms of the boundaries of the budget, that's still being defined. At the end of the day, we are seeing return on investment, and that's what's key.

Are you seeing a movement toward a greater scope of integration for the data science tool chain? Or is it that at the high end, where you have companies with a thousand data scientists, they know how to deal with specialized components, whereas where there's a smaller pool of expertise, the desire for end-to-end integration is greater?

So I think there's this kind of thinking that is not necessarily right, and that is: if you have a bigger data science team, you're more sophisticated. We actually see the same sophistication level in a thousand-person data science team, in many cases, as in a twenty-person data science team, and sometimes the inverse. I mean, it's kind of crazy, but it's about how we make sure that we give them the tools so they can drive value, and tools need to include collaboration and workflow, not just hammers and nails. How do we work together? How do we scale knowledge? How do we get it into the hands of the line of business so they can use the results? That is what's key.

That's great, Ian. And I also like how you articulated that starting small with quick wins can make a massive impact. We want to thank you so much for stopping by theCUBE and sharing that, and what you guys are doing at DataScience.com to help enterprises really take advantage of the value that data can deliver.

Thanks so much for having DataScience.com on. Really appreciate it.

Absolutely. George, thank you for being my co-host. We want to thank you for watching theCUBE. I'm Lisa Martin with George Gilbert, and we are at our event, Big Data SV, on day two. Stick around, we'll be right back with our next guest after a short break.