Hi, I'm Peter Burris. Welcome to Wikibon's Action Item. Once again, Wikibon's research team is assembled, centered here in theCUBE Studios in lovely Palo Alto, California. So I've got David Floyer and George Gilbert with me here in the studio and on the line we have Neil Raden and Jim Kobielus. Thank you once again for joining us, guys. This week we're going to talk about an issue that has been a dominant consideration in the industry, but it's unclear exactly what direction it's going to take, and that is the role that open source is going to play in the next generation of solving problems with technology, or we could say the role that open source will play in future digital transformations. No one can argue whether or not open source has been hugely consequential; as I said, it has been. It's been one of the major drivers, not only of new approaches to creating value, but also of new types of solutions that actually are leading to many of the most successful technology implementations that we've ever seen. That is unlikely to change, but the question is what form will open source take as we move into an era where there are new classes of individuals creating value, like data scientists, where there are new problems that we're trying to solve, like problems that are mainly driven by the role that data, as opposed to code, plays, and where there are new classes of providers, namely service providers as opposed to product or software providers. These issues are going to come together and have some pretty important effects on how open source behaves over the next few years, what types of challenges it's going to successfully take on, and ultimately how users are going to be able to get value out of it. So to start the conversation off, George, let's start by making a quick observation. What has the history of open source been? Take us through it kind of quickly. Okay, because the definition has changed.
In its first incarnation, it was: fix UNIX fragmentation and the high price of UNIX system servers, meaning the proprietary UNIXes and the proprietary servers they were built on. That actually rather quickly morphed into a second incarnation where it was, let's take the LAMP stack, Linux, Apache, MySQL, PHP, Python, and substitute that for the old incumbents, which were UNIX, BEA WebLogic, the J2EE server, an Oracle database, and an EMC storage device. So that was a collapse of the price of infrastructure. So really quickly then it morphed into something very, very different, which was we had the growth of the giant internet scale vendors. And neither on pricing nor on capacity could traditional software serve their needs. So Google didn't quite do open source, but they published papers about what they did. Those papers then were implemented. Like MapReduce. Yeah, MapReduce, Bigtable, Google File System. Those became the basis of Hadoop, which Yahoo open sourced. And there's another incarnation going, that's probably getting near its end of life right now, which is sort of a hybrid, where you might take Kafka, which is open source, and put sort of proprietary bits around it for management and things like that. Same with Cloudera. This is called the open core model. It's not clear if you can build a big company around it, but the principle for most of these is the value of the software is declining, partly because it's open source and partly because it's so easy to build new software systems now, and the hard part is helping the customer run the stuff. And that's where some of these vendors are capturing it. So let's, David, turn our attention to how that's going to turn into actual money. So in this first generation of open source, and I think up until now, certainly Red Hat and Canonical have made a lot of money by packaging and putting forward distributions.
IBM has been one of the leaders in contributing to open source and then turning that into a services business. Cloudera, Hortonworks, MapR, some of these other companies have not generated the same type of market presence that a Red Hat or Canonical has put forward. But that doesn't mean that there aren't companies out there that are being very successful at appropriating significant returns out of open source software. Mainly, however, they're doing it, as George said, as a service. Give us some examples. I think the key part of open source is providing a win-win environment so that people are paid to do stuff. And what is happening now a lot is that people are putting stuff into open source in order that it becomes a standard, and then also in order that it is maintained by the community as a whole. So those two functions, those two capabilities are being paid for by a company often, by IBM or by whoever it is, to do something on behalf of that company so that it becomes a standard, that it becomes accepted. That is a good business model in the sense that it's win-win. The developer gets recognition. The person paying for it achieves their business objective of, for example, getting a standard recognized. At volume. At volume, yes. So it's a way to get to volume for the technology that you want to build your business around. What I think is far more difficult in this area is application type software. So where open source has been successful, as George said, is in the stacks themselves, the lower end of the stacks. There are a few, and they usually come from very, very successful applications like Word, Microsoft Word, or things like that, where they can be copied and be put into open source. But even there, they have around them software from a company, Red Hat or whoever it is, that will make it successful. Yeah, but OpenOffice wasn't that successful.
I mean, get to the kind of, today we have Amazon, we have some of the hyperscalers that are using that open core model and putting forward some pretty powerful services. Are they, in effect, the new Red Hat? Is that the new Canonical? The company that has made the most money is clearly Amazon. They took open source code and made it robust and made it in volume. Those are the two key things that you have to have for success. It's got to be robust, it's got to be in volume. And it's very difficult for the open source community to achieve that on its own. It needs the support of a large company to do that, and it needs the value that that large company is going to get from it for them to put the resources in. So that has been a very successful model. A lot of people decry it because they're not giving back, and there's an argument that Amazon, for example, yes, they have relatively few committers. I think that's more of a problem in the Ts and Cs of the open source contract. So those should probably be changed to put more onus on people to give back into the pool. So let me stop you. So we've identified one thing that likely is going to have to evolve as we move forward with these two problems: some of the terms and conditions, to try to ensure that there's that quid pro quo, that that win-win exists. So Jim Kobielus, let me ask you a question. As David mentioned, open source has been more successful where there is a clear model, a clear target of what the community is trying to build; it hasn't been quite as successful where in fact it is expected that the open source community is going to start with some of the original designs. So for example, there's an enormous plethora of big data tools and yet people are starting to ask why isn't big data more successful? And partly it's because putting these tools together is so difficult.
So are we going to see the types of artifacts and assets and technologies associated with machine learning, AI, deep learning, et cetera easily lend themselves to an open source treatment? What do you think? I think we're going to see open source very much take off in the niches of the deep learning and machine learning and AI space where the target capabilities that are built out are fairly well understood by a broad community, machine learning, clearly. We have a fair number of frameworks that are already well established with respect to the core capabilities that need to be performed for modeling and training and deployment of statistical models into applications. That's why we see a fair amount of takeoff for, say, TensorFlow, which Google built and then open sourced, because the core of deep learning in terms of the algorithms, in terms of the kinds of functions you perform to be able to take data and do feature engineering and algorithm selection are fairly well understood. So those are the kind of very discrete capabilities for which open source code is becoming standard. Now, there are many different alternative frameworks for doing that, TensorFlow being one of them, that are jousting it out for presence in the market. But the term is commoditized. More of those core capabilities are being commoditized by the fact that they're well understood and agreed to by a broad community. So those are the discrete areas where we're seeing the open source alternatives become predominant. But where you take a TensorFlow and combine it with a Spark and with a Hadoop and a Kafka and broader collections of capabilities that are needed for a robust infrastructure, those are disparate communities that each have their own participants, committers and so forth. Nobody owns that overall stack. There's no equivalent of a LAMP stack where all things to do with deep learning, machine learning and AI on an open source basis come to the fore.
If someone or some group of companies and some communities are going to own that broadening stack, that would indicate some degree of maturation for this overall ecosystem. But that's not happening yet. We don't see that happening right now. So Jim, I will admit my bias. I hate the term commoditization, but I want to unify what you said with something that David said. Essentially what we're talking about is the agreement in a collaborative, open way around the conventions of how we perform work, the compute model, which then turns into products and technologies that can in fact be distributed and regarded as a standard and regarded as a commodity around which trading can take place. But what about the data side of things, George? Jim's articulated, I think, a pretty good case that we're going to start seeing some tools in the marketplace. It's going to be interesting to see whether that is just further layering on top of all this craziness that's happened in the big data world and just adding to it in the ML world. But how does the data fit into this? Are we going to see something that looks like open source data in the marketplace? Yes, and a modified yes. Let me take those in two pieces. Just to be slightly technical while hopefully not being too pedantic, software used to mean algorithms and data structures. So in other words, the recipe for what to do and the buckets for where to put the data. That has changed in a data intensive machine learning analytic world where the algorithms and the data are so tied together, the instances of the data, not the buckets, that the data change the algorithms, the algorithms change the data. The significance of that is when we build applications now, it's never done. And so you go, you know, the construct we've been focusing on is a digital twin, more broadly defined than a smart device. But when you go from one vendor and you sort of partially build it, it's an evergreen thing, it's never done.
Then you go to the next vendor, but you need to be able to backport some core of that to the original vendor. So for all intents and purposes, that's open source. But it boils down to actually the original Berkeley license for open source, not the Apache one that everyone's using now. And remind me of the other issue. The other issue is, are we going to see data sets become open source like we see code bases, code fragments and algorithms becoming open source? Yes, this is also just the way sort of Amazon made infrastructure sort of commoditized and rentable. There are going to be many data sets that used to be proprietary, like a Google web crawl, and the Google knowledge graph for disambiguating people, places and things. Some of these things are either becoming open source or openly accessible by API. And so when you put those resources together, you're seeing a massive deflation or a massive shrinkage in the capital intensity of building these sorts of apps. So Neil, if we take a look then at where we are thus far, we can see that, even though we're moving to a services oriented model, Amazon, for example, is a company that is able to generate commercial rents out of open source software. Jim has made a pretty compelling case that open source software can, or will, emerge out of the tooling world for some of these new applications. There are going to be some examples of data sets or at least APIs to data sets that will look more open source like. So it's not inconceivable that we'll see some actual open source data. I think GDPR and some other regulations. I mean, we're still early in the process of figuring out how we're going to turn data into a commodity, using Jim's words. But what about the personnel? What about the people? There were reasons why developers moved to open source.
Some of the soft reasons that motivated them to do things: who they work with, getting the recognition, working on relevant projects, working with relevant technologies. Are we going to see a similar set of soft motivators diffuse into the data scientist world so that these individuals, the ones who are creating the real value, are going to have some degree of motivation to participate with each other, collaborate with each other in an open source way? What do you think? Well, yeah, good question. I think the answer is absolutely yes, but it's not unique to data scientists. Academics, scientists in molecular biology, civil engineers, they all want to be recognized by their peers at some level beyond just what they're doing in their organization. But there's another segment of data scientists that are just guys working for a paycheck and they're generating predictive analysis and they're helping the company along and so forth, and that's what they're going to do. But the whole open source thing, I mean, do you remember object-oriented programming? Do you remember JavaBeans? Do you remember web services? We tried to turn developers into librarians, and when they wanted to develop something, you go to GitHub. I go to GitHub right now and I say, look, I'm looking for a utility that can figure out why my face is so pink on this camera, right? I get a thousand listings of programs and I have no idea which ones work and which ones don't. So I think the whole open source thing is about to explode or already has in terms of piece parts, but I think managing it in an organization is different, and when I say an organization, there's the Googles and the Amazons and so forth of the world and then there's everybody else. All right, so we've identified an area where we can anticipate that some consequential change will be required to modernize the open source model: the licensing model.
We've seen another one where the open source community is going to have to understand how to move from a product and code orientation to a data and service orientation. Can we think of any others? There's one other that I'd like to add to that and that is compliance. We addressed it to some extent, but compliance brings some real world requirements onto code and data, and you were saying earlier on that one of the options is bringing code and data together so that they intermingle and change each other. I wonder whether that, when you look at it from a compliance point of view, will actually pass muster, because you need from a compliance point of view to prove, for example, in the health service, that it works and it works the same way every time, and if you've got a set of code and data that doesn't work the same every time, you are probably going to get pushback from the people who regulate health that you can't do it that way. You'll have to find another way. Well, but that again is the same each time. So the point I'm making. This is a bigger issue than just open source. This is an issue where if the idea of continuous refinement of the code and the data. Automatic refinement. Automatic refinement could in fact, we're going to have to change some of the compliance laws. Is it possible that the open source community might actually help us understand that process? Absolutely. That's a good point. I think it's a really interesting point. Because you're right, George, that the idea of continuous development is not something that, for example, Sarbanes-Oxley says, oh yeah, I get this. Sarbanes-Oxley is like, okay, yes, the data, I acknowledge that this data is right and I acknowledge that this code or the process by which it was created is right. Now, although this is another subject, let's work this up later, but I think it's relevant here because in many respects, it's a difference between an income statement and a balance sheet, right?
Saying it's good now is kind of like the income statement. But let's come back to this, because I think it's a bigger issue. But you're asserting that the open source community, in fact, may help solve this problem by coming up with new ways of conceiving, say, versioning of things and stamping things and what is a distribution, what isn't a distribution with some of these more tightly bound. What you find normally is that- I think we're going to- Go ahead, Jim. Go ahead, Jim. Yeah, just to elaborate on what Peter was talking about, that whole theme, I think what we're going to see is more open source governance of models and data within distributed development environments using technologies like blockchain as a core enabler for these workflows, for these, as it were, distributed hyperledgers that indicate the latest and greatest version of a given data set or a given model being developed somewhere around some common solution domain. I think that those kinds of environments for governance will become critically important as this pipeline for development and training and deployment of these assets gets ever more distributed and virtual. Yeah, by the way, Jim, I actually had a conversation with a very large open source distribution company a few months ago about this very point. And I agree, I think blockchain in fact could become a mechanism by which we track intellectual property or track intellectual contributions, and find ways to then monetize those contributions, going back to what you were saying, David. And perhaps that becomes something that looks like the basis of a new business model for how we think about how open source goes after these looser, goosier problems. To guarantee integrity without going through necessarily- Very important, because at the end of the day, George, and it's always hard to find somebody to maintain it.
Right, one of the big challenges companies today are having as they do open source is they want to be able to keep track of their intellectual property, both from a contribution standpoint, but also inside their own business. Because they are very, very concerned that the stuff that they're creating that is proprietary to their business in a digital sense might leave the building. And that's not something a lot of banks, for example, want to see. I want to stick one step into this logic process that I think we haven't yet discussed, which is we're talking about now how end customers would consume this, but there's still a disconnect in terms of how the open source software vendors or even hybrid ones can get to market with this stuff. Because between open source pricing models and pricing levels, we've seen a slow motion price collapse. And the problem is that the new sort of go to market motion is actually made up of many motions, which is discover, learn, try, buy, recommend. And within each of those, the motion is different. And you hear, it's almost like a reflex, you know, like when your doctor hits you on the knee and your leg kind of bounces, everyone says, oh yeah, we do land and expand. And land was the discover, learn, try, augmented with inside sales. The recommend and sort of standardize is still traditional enterprise software, where someone's got to talk to IT and procurement about fitting into the broad architecture and infrastructure of the firm. And to do that, you still need what has always been called the most expensive migratory workforce in the world, which is an enterprise sales force. But I would suggest that there's a big move towards standardization of stacks. True private cloud is about having a stack which is well established. And the relationship between all the different piece parts and the stack itself is the person who's responsible for putting that stack together and maintaining that stack.
David, I'm going to pretend that you're a CIO. Are you going to buy OpenStack or are you going to buy the VMware stack? I'm going to buy the VMware stack. What does that say about open source? Well no, the point I'm making is that those open source communities or pieces would then be absorbed into the stack as an OEM supplier, as opposed to a direct supplier. And I think that's true for all of these stacks. If you look at the stack, for example, and you have code from NetApp or whatever it is that's in that code and they're contributing it, you need an OEM agreement with that provider. And it doesn't have to be necessarily open source. Bottom line is this stuff's still really, really complicated. But this is also very complicated. That business model of being an OEM provider is very different from growing an enterprise sales force. You're selling something that goes into the cost of goods sold of your customer, and that cost of goods sold better be less than 15% and preferably less than 5%. Well, your point is that if you can't afford a sales force, an OEM agreement is a much better way of doing it. You have to get somebody else's sales force to do it for you. So look, I'm going to do the action item on this. I think that this has been a great conversation. Again, David, George, Neil, Jim, thanks a lot. So here's the action item. Nobody argues that open source has not been important and nobody suggests that open source is not going to remain important. What we think, based on our conversation today, is that open source is going to go through some changes, and those changes will occur as a consequence of new folks that are going to be important to this, like data scientists, who are central to some of the new streams of value in the industry and may not have the same motivations that the old developer world had. New types of problems that are inherently more data oriented as opposed to process oriented.
And it's not as clear that the whole concept of data as an artifact, data as a convention, data as standards and commodities is going to be as easy to define as it was in the code world. As well as ultimately IT organizations increasingly moving towards an approach that is focused more on the consumption of services as opposed to the consumption of product. So for these and many other reasons, our expectation is that the open source community is going to go through its own transformation as it tries to support current and future digital transformations. Now, some of the areas that we think are going to be transformed: we expect that there's going to be some pressure in licensing. We think there'll be some pressure in how compliance is handled, and we think the open source community may in fact be able to help in that regard. And we think very importantly that there will be some pressure on the open source community trying to rationalize how it conceives of the new compute models, the new design models. Because where open source has always been very successful is when we have a target, and we can collaborate to replicate and replace that target or provide a substitute. I think we can all agree that in 10 years we will be talking about how open source took some time to in fact put forward that TPC stack as opposed to define the true private cloud stack. So our expectation is that open source is going to remain relevant. We think it's going to go through some consequential changes and we look forward to working with our clients to help them navigate what some of those changes are, both as committers and also as consumers. Once again, guys, thank you very much for this week's Action Item. This is Peter Burris and until next week, thank you very much for participating in Wikibon's Action Item.