Live from San Francisco, it's theCUBE, covering Spark Summit 2017. Brought to you by Databricks.

You are watching theCUBE at Spark Summit 2017. I'm David Goad, here with my friend George Gilbert. How are you doing, George?

Good.

All right, but the man of the hour is over to my left. I'd like to introduce a Databricks partner. His name is Octavian Tanase, and he's the SVP for the Data ONTAP Software and Systems Group at NetApp. Octavian.

Thank you for having us.

All right, well, you have kind of an interesting background. We were chatting before. You started as an engineer, a developer?

Yeah, so I'm in an executive role right now, but I have an interesting trajectory. Most people in a similar role come from a product management or sales background. I'm a former engineer, somebody who has a passion for technology, and now for customers and for building interesting technologies.

Okay, well, if you have a passion for this technology, then I'd like to get your take on the marketplace a little bit. Tell us about the evolution of the mainstream and what you see changing.

Well, I think data is the new currency of the 21st century, right? And you have a desire and a thirst to get more out of your data. So you have developers and analysts looking to build the next great application, to mine data for great business outcomes. NetApp, as a data management company, is very much interested in working with companies like Databricks, and with a number of hyperscalers, to enable the kinds of solutions that support in-place analytics or data lakes, solutions that really enable developers and analysts to harness that data.

Mm-hmm. So maybe walk us through what you've seen to date in terms of the mainstream use cases for big data, then tell us where you think they're going, and what walls need to be pushed back with the constituent technologies to get there.

So originally I saw a lot of people investing in data lake technologies.
Data lakes, in a nutshell, are massive containers that are simple to manage, with scalable performance, where you can aggregate a number of data sources and then run a MapReduce-type workload to correlate that data, to harness it and draw conclusions. That was the original track. Over time, given how dynamic and diverse the data is, there's been a desire to do a lot of this analytics inline, in real time, right? And that's where a company like Databricks comes in, and that's where the cloud comes in, to enable both the agility and the kind of real-time behavior needed to get those analytics.

Now, this is your first Spark Summit?

Absolutely, happy to be here.

Well, I know it's just the first day, but what have you learned so far? Any great questions from other participants?

Well, I see a lot of people innovating very fast. I see established players paying attention, and I see new companies looking to take advantage of this revolution that is happening around data, data services, and data analytics.

Maybe tell us a little more about what we were discussing before we started: how some customers who are very sensitive about their data want to keep it in their data centers, or in Equinix, which still counts as pretty much theirs, while the compute is often in the cloud somewhere.

So as you can imagine, we work with a lot of enterprise customers, and one thing I've learned in the last couple of years is that their thought process has evolved. Banks and large financial institutions two years ago were not even considering the cloud, and I see that changing now. I see them wanting to operate like a cloud provider, to take advantage of the flexibility and the agility of the cloud.
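The MapReduce-type workload Tanase describes, aggregating data from several sources and correlating it by a key, can be sketched in plain Python. The record layout and the aggregation key below are hypothetical, chosen only to illustrate the map/shuffle/reduce pattern a data lake job follows; a real deployment would run this on Hadoop or Spark, not in a single process.

```python
from collections import defaultdict

# Hypothetical records aggregated into a data lake from several sources.
records = [
    {"source": "clickstream", "region": "us-west", "bytes": 120},
    {"source": "sensors",     "region": "us-west", "bytes": 45},
    {"source": "clickstream", "region": "eu",      "bytes": 80},
]

def map_phase(record):
    # Map: emit (key, value) pairs; here we correlate traffic by region.
    yield record["region"], record["bytes"]

def reduce_phase(pairs):
    # Shuffle: group values by key, then reduce each group with a sum.
    groups = defaultdict(int)
    for key, value in pairs:
        groups[key] += value
    return dict(groups)

totals = reduce_phase(kv for r in records for kv in map_phase(r))
print(totals)  # {'us-west': 165, 'eu': 80}
```

The same two-phase shape scales out because the map step is stateless per record and the reduce step only needs the records sharing a key, which is exactly what the framework's shuffle guarantees.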
I see them being more comfortable with the kind of security capabilities that the cloud offers today. Security has probably been the most troublesome issue folks have had to overcome, and then there's the gravity of the data. The reality is that the data is very distributed, dynamic, and diverse in nature, as I mentioned earlier. There's data created at the edge and data created in the data center, and people want to be able to process that data in real time regardless of where it is, without necessarily having to move it in some cases. So everybody's looking for data management solutions that enable mobility, governance, and management of that data while enabling analytics wherever that data lives.

You said some really interesting things. I can see where, in the customer's data center, extended to Equinix, they want to bring the compute to the data because the data is heavier than the compute. But what about on the edge? Is there enough data there to keep it there and bring compute down to the edge, or do you co-locate compute persistently? And then how much of the compute is done at the edge?

The reality is that you're probably going to see customers do both. There is more data created at the edge than ever before, and you'll see a lot of the data management companies invest in software-defined solutions that require a very small footprint, both from a storage point of view and in compute. One of the advantages of a technology like ONTAP is the investment that has been made to enable data reduction, because your ability to store data at the edge is really quite limited.
So you want to have these capabilities to reduce the footprint by compression, by de-duping, by compacting that data, and then make some smart decisions at the edge. You might perhaps do some inline, in-place analytics there and move some of the data back into a central data center where more batch analytics can take place.

When you talk about that compaction and de-duping (there was one more, but I think everyone gets the point), are you talking about having a NetApp ONTAP device near the edge or at the edge? And is that device actually software only?

You're probably aware that ONTAP now ships in three flavors, or three form factors. There is an engineered appliance, and we will likely do that for many years to come, but we also have ONTAP running in a virtual environment, either on KVM or VMware, as well as ONTAP running in the cloud. We've been running in the AWS cloud since 2014, we're also running in the Azure cloud, and we are talking to other vendors to improve the ubiquity of software-defined ONTAP.

So just to be really specific: we're told now that an edge gateway, not an edge device but a gateway, is about two gigs of memory and two cores. Is that something a software-defined ONTAP would run on?

Absolutely, right? You'll see us running on a variety of devices in the field with energy companies. You'll see ONTAP running in the tactical sphere, and we have projects that I can't really tell you about, but you'll find it broadly deployed on the edge.

Okay, that's... yeah, talk a little bit about NetApp. What are some of the business outcomes you're looking for here? Do you have good executive sponsorship of these initiatives?

So we are very excited to be here. NetApp has been in the data management realm for a very, very long time, and analytics is a natural place, a great adjacency for us. We've been very fortunate to work with NoSQL-type companies.
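The data-reduction techniques mentioned above, de-duplication and compression, can be sketched with the Python standard library. This is only an illustration of why they shrink edge footprints so effectively on repetitive data; the block size, sample telemetry, and hash-based dedup scheme here are assumptions for the sketch, not how ONTAP actually implements these features.

```python
import hashlib
import zlib

def reduce_footprint(blocks):
    """De-duplicate identical blocks by content hash, then compress
    the remaining unique data. Returns (stored_bytes, original_bytes)."""
    seen = {}
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        seen.setdefault(digest, block)  # keep one copy per unique block
    unique = b"".join(seen.values())
    compressed = zlib.compress(unique)
    original = sum(len(b) for b in blocks)
    return len(compressed), original

# Edge telemetry is often highly repetitive, so dedup plus compression
# can cut the stored footprint by orders of magnitude.
blocks = [b"sensor:temp=21.5;" * 64] * 10 + [b"sensor:temp=21.6;" * 64]
stored, original = reduce_footprint(blocks)
print(stored, original)
```

Dedup removes the ten identical copies before compression ever runs, which is why the two techniques are complementary rather than redundant.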
We've been very happy to collaborate with some of the leaders in analytics, such as Databricks. We are entering the IoT space and enabling solutions that are really edge-focused. So overall, this is a great fit for us, and we're very excited to participate in the summit.

So what do you think will be... we've heard from Matei that the state of the art, I hate to say the word fantasy, but the experimentation frontier perhaps, is structured streaming: continuous apps that call on deep learning models. Where would you play in that, what do you think the barriers are, and what comes next?

I think any complete analytics solution will need a set of services and infrastructure that lends itself to that type of workload, that type of use case. In some cases you'll need very fast storage with super low latencies. In some cases, you will need tremendous throughput. In some cases, you will need that small footprint of an operating system running at the edge to enable some of that inline processing. So I think the market will evolve very fast, the solutions will evolve very fast, and you will need the kind of industry sponsorship that comes from companies that really understand data management and have made it their business for a very, very long time. So I see a synergy being created between the innovation in analytics, the innovation that happens in the cloud, and the innovation that a company like NetApp does around the data fabric and around the types of services required to govern, move, secure, and protect that data in a very cost-efficient way.

Okay, this is kind of key, because people are struggling to get some sort of commonality in their architecture between the edge, on-prem, and the cloud, but it could be at many different levels. What's your sweet spot for offering that? I mean, you talked about de-duping and...

Compression and compaction.
Compression and snapshots, or whatever. So having that available in different form factors, what does that enable a customer to do, perhaps using different software on top?

I'm glad that you asked. The reality is that we want to enable customers to consolidate both second- and third-platform applications on the ONTAP operating system. Customers will find not only flexibility but also consistency in the data management, regardless of where the data is, whether it's in the cloud, near the cloud, or on the edge. So we believe we have the most flexible solution to enable the data analytics and data management that lend themselves to all these use cases and enable next-generation applications.

Okay, but is that predicated on having not just Data ONTAP but also a common application architecture on top?

I think we want to enable a variety of solutions to be built there. So in some cases we're building glue. What do I mean by glue? It's, for example, an NFS-to-HDFS connector that enables translation from the native format in which most of the data lives into a Hadoop or Spark type of EMR system. So we're investing in enabling that flexibility and enabling the innovation that will happen at many of the companies we see here on the floor today.

Okay, that makes sense. We have just a minute to go here before the break. So if you could talk to the entire Spark community, and you are right now on theCUBE, what's on your wish list? What do you wish people would do more of, or what could you get help with?

Well, I think my ask is: continue to innovate, push boundaries, and continue to be clever in partnering, both with small vendors that are really innovating at tremendous pace, as well as with established vendors that have made data management their business for many years and are looking to participate in the ecosystem. Let's innovate together.
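The "glue" idea above, an NFS-to-HDFS connector, amounts to letting a Hadoop or Spark job address files that physically live on an NFS mount. A minimal sketch of one piece of that translation, mapping NFS paths to HDFS URIs, is below; the mount point, namenode address, and function name are all hypothetical, invented for illustration, and this is not part of any real NetApp connector API.

```python
from urllib.parse import quote

# Hypothetical locations: files on an NFS mount exposed to analytics
# jobs under an HDFS namespace rooted at a namenode.
NFS_MOUNT = "/mnt/netapp"
HDFS_ROOT = "hdfs://namenode:8020/datalake"

def nfs_to_hdfs(path):
    """Translate an absolute path on the NFS mount into the HDFS URI
    under which a Hadoop/Spark job would address the same file."""
    if not path.startswith(NFS_MOUNT + "/"):
        raise ValueError(f"{path} is not under the NFS mount")
    relative = path[len(NFS_MOUNT) + 1:]
    return f"{HDFS_ROOT}/{quote(relative)}"  # quote() keeps '/' intact

uri = nfs_to_hdfs("/mnt/netapp/sales/2017/q2.parquet")
print(uri)  # hdfs://namenode:8020/datalake/sales/2017/q2.parquet
```

A real connector also has to translate the access protocol itself (POSIX file operations versus the HDFS RPC protocol), which is the harder part; the path mapping just shows why jobs can read the data "in place" without a copy step.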
All right, Octavian, thank you so much for taking some time out of your busy day to share with theCUBE. We appreciate it.

Very good. Thank you so much. Pleasure. Thank you.

You're watching theCUBE here at Spark Summit 2017. We'll see you in a few minutes with our next guest.