Live from the campus of MIT in Cambridge, Massachusetts, it's theCUBE, covering the MIT Chief Data Officer and Information Quality Symposium. Now, here are your hosts, Stu Miniman and Paul Gillan.
Welcome back, this is theCUBE, SiliconANGLE's streaming media flagship platform. We're here at the MIT CDO IQ conference in Cambridge, Massachusetts. I'm Paul Gillan, with my colleague, Stu Miniman. And our guest is Peter Wang. As we head into the final stretch of today's video coverage, Peter Wang is the founder and CTO of Continuum Analytics. And we wanted to ask him to come on theCUBE because of a session that he's leading later today called "The D in CDO Stands for Disruption, Not Data." And that's a provocative title, certainly. But Peter, let's ask you first to just talk quickly about what Continuum Analytics does.
All right, so we are the creators of a very popular distribution of open source data science libraries and tools. It's called Anaconda, and it's been downloaded millions and millions of times now. And it includes Python and R and all the core libraries you need to do data science, data analytics, machine learning and visualization in those particular languages. We also have enterprise and proprietary components that we've built on top to help businesses operationalize open source, to help them manage and govern the use of open source within their environments. Because business environments are very different from what open source hackers normally deal with out in the wild.
The topic that you chose for this session is about disruption, not data. I mean, it sounds like you're trying to stir things up a little bit. Why should the CDO role be disruptive? Why shouldn't it just play nicely with others?
Right, that's a great question.
So the thing that I realized, and I think it's been very validating to hear in the sessions today and yesterday, is that so many people are reflecting on the fact that the CDO role is a relatively new one and it's constantly being redefined. It sometimes reports to the CTO, sometimes reports to the CIO, sometimes reports directly to the CEO. And as for what the CDO owns, he or she has a different charter depending on the kind of environment, federal versus finance versus pharma. And so the thing that's really interesting about the CDO role, that I've heard many people reflect on in these conversations, is that they don't really own all the data, right? Because data's everywhere. Everybody owns their own data. The CDO's charter really has to be about laying out strategy for managing data policy, governance, quality, all of those things, in a time of great change across the IT and the computational environments. So the thesis of my talk, really, is that the CDO has to basically provide strategic vision around how to lean into that change, how to be resilient in the face of that change, not just how to hunker things down and stay with the old paradigms and continue building bigger and bigger and taller silos that are going to get washed out when more data comes in.
But silos are a fact of life in large organizations and it doesn't seem like that's going to go away. I mean, when you're telling people that they have to take ownership responsibility for data, aren't you in effect endorsing silos?
No, the difference is this: silos may never go away, they may just get buried. The water level is going to rise higher and higher. And one thing that we've seen in our company, we have product but we also do consulting and training. So we get invited to large firms all the time that are using Python and R in production. And what we see is that in many of these places, there's what I call dark data. There's data that's not in the provenance chain.
There's data systems that have been set up maybe on the side, on some Linux box somewhere. There's data systems that result from a massive export from some official data system into some other thing. And actually I would say the mass of data that I've seen in these places, that is relied on in production on a daily basis, those are dark data systems. And so I would say that silos are a fact of life, but all the data that's leaking out of the silos is also a fact of life. And so CDOs need to be agile and they need to actually lean into that. And so that's sort of the theme of the talk.
Peter, it's interesting, if you look back, technology used to be one of those things most people didn't understand. Today, for most people we talk to, the technology seems to be almost the easy part. It's the business processes that are the hard part, and if we look at the CDO, it's that business process that is the main piece of what they're doing today.
Right, and I think that that's the disruption, that they don't have to go and disrupt all these technology systems. They need to disrupt the stagnant or the legacy ways of thinking about data and data governance. In the same way, I mean, IT is going through this in general, right? People talk about bring your own device and all these other things because you just can't control anymore whether or not someone's going to check work email on their phone. You just can't lock all of that down as easily or as naively as I think people initially tried to. So CDOs, I think, have a similar charter in terms of reforming, really leading change in their organizations around mindsets and attitudes towards governance of data.
So how do you see the relationship of the CDO with the developers and data scientists and the people that are down there using all those tools and creating some of that change?
That's a really great question. I think that one of the key things the CDO can do is actually be a peacemaker between those people.
A thing that we like to say a lot is that data science is a team sport, and it really drives at the point that silos of technology create silos in the organization, which then create fiefdoms and people kind of at war with each other. And that doesn't have to be the case anymore. And so for CDOs, I think, a part of that charter of changing mindsets is actually figuring out how we can have cross-functional and sort of integrative and collaborative approaches. Now that's easier said than done, obviously.
Yeah, I'm curious. I think in the software world there's this thing called Conway's Law. Conway's Law says that your software is going to look like your organization. So can a peacemaker actually do this, or do I need to restructure the organization, and what's the CDO's role in all of that?
So Conway's Law is actually, yes, that software resembles the communication patterns within the organization that produces it. And so I think a very similar thing can happen here: because data does flow everywhere, the CDO can use that almost as a conductor of electricity and influence. And you can set data policy that encourages collaboration or you can set data policy that reinforces more of the same. Usually, you know, CDOs will be very intimately involved with the analytical sandboxes that are being built out, maybe a data science center of excellence that has executive cover. So the CDO has deep connections and deep buy-in at this executive level. They can actually use that as a way to kind of get their tentacles into the organization and encourage collaboration. And actually, you know, sometimes in these organizations that have such fiefdoms and such divides, all that you need is for somebody with enough cover to say it's okay to do this. It's okay to be innovative. And that's what we've seen, that there's a lot of fear of innovation that reinforces the silo at the organizational level.
So I think that's one of the biggest things the CDO could do.
You talk about transformation, transforming thinking patterns. One of the big hurdles that IT organizations, I'm sure, have to get over is moving from operational thinking to strategic thinking. IT has been an operational function for 50 or 60 years. And now we have to start thinking about data strategically. Do you have any advice for how you can make that shift?
Yeah, well, that's a very, very big topic. I would say that actually in some cases, I mean, IT has different flavors and components, right? Some organizations realized a long time ago that IT was deeply strategic. I think in the finance sector, we see this quite a bit. There is the operational IT of how you get your email and whose Outlook is up and what calendar extensions you have. But then there's the IT around how do we build the fastest simulation, the fastest market data feed? How do we build all these things? In those cases, IT is life-or-death strategic. So I think that what organizations have to do is a very similar thing: recognizing that the way they manage data is not merely an operational concern, not merely risk avoidance, not merely a compliance thing, but that actually, by enabling agility, by enabling data science on the organization's data and getting smarter as an organization, that's a deep strategic capability. So again, we see that finance, for whatever reason, tends to lead the field in this area. And so we've seen that happen in finance in a massive way. We're now starting to see it in a lot of the startups, right? If you ask many of the startups in Silicon Valley, they see themselves as being data companies because the data they have on social behavior, on customer preference, that's their crown jewels. So I think those are great exemplars of what this could become.
You lead a company that sells open source software, has open source software and proprietary software as well.
You've been in this business quite a while. Have you seen the attitudes change within enterprise IT organizations toward the use of open source? Is it now basically mainstream or does it still have a way to go?
So among the people that we talk to, it's fairly mainstream, but there's some selection bias there. I think across the board, there's definitely more acceptance of it. I think, and this is actually something I'm going to cover in my session, open really means different things. When I say open data science, I don't merely mean open source. That open actually refers to data science as a team sport. So being open to collaboration, also being open to innovation that happens outside of your firm. So having less of a not-invented-here sort of approach. And most importantly, I think the key thing, well, where I see open source having the most positive impact within enterprise IT is in places that actually have moved beyond the fear and risk avoidance and more into the strategic thinking. And when they look at the strategic thinking, they realize actually there's a lot of innovation happening out there. We cannot possibly pay for all the innovation ourselves. We have to actually tap into this global pool of innovation because if we don't, our competitors will. And I think that's really the bar that's been set.
I mean, everybody's using some level of open source. Let's face it. The biggest, most successful companies in the world are relying on open source in production every day. So for anyone to sit there and say, oh, well, I don't know if I can use it because someone else might have written the code. Well, you don't know who wrote the code for some closed source software either, right? Everybody is using open source.
Everybody's using open source. If you've done a search on the internet, you're using open source. Everyone's using open source. There's no question.
I mean, it's just not even a question. For me, actually, it's a really interesting signal when I'm in a customer conversation and that question comes up. It's like, well, can we really trust it if it's open source? That to me says more about the customer, actually, right? The customer actually has never even gone through that conversion into adopting all of these new age technologies.
I'm curious, any specific verticals that you're seeing at the show? We've talked to a lot of government people. Of course, they've got kind of the open mandate here. Very regulated, government- and governance-specific industries tend to be what we see here. But what have you been seeing?
So yeah, there's definitely a lot of government here. That's a little bit of a surprise to me, but given the panels and sessions, I think that's not too surprising. There's also actually a wide mix of folks, right? We've talked to pharma people. I've talked to some finance people. It's all the standard people you'd expect would show up to look for solutions for big data management and data quality. Those are the folks that have that problem in spades.
Very quickly, best brisket in Austin.
Best brisket in Austin? I think I make a pretty good brisket, but consensus opinion is Franklin's.
Thank you, Peter Wang. Very important to answer that question. Thanks for joining us here on theCUBE. We've got a couple more guests before we wrap up today's coverage. So we'll be right back in a moment.