All right, so we talked about the success of open source, about GraphQL and APIs, and a faster, better way to manage APIs. We talked about continuous delivery. We wouldn't be complete without artificial intelligence. So our next speaker is a Senior Research Fellow of AI at Uber Labs. He is also an Associate Professor of Psychology and Computer Science at Stanford University. And he has released several open source projects, including Church, WebPPL, and Pyro. Today he's going to join me to do a Q&A on open source and AI. So please welcome to the stage Noah Goodman. All right, artificial intelligence. There are a lot of folks here in the audience who are technologists, but who are also making decisions about how to make sense of AI and ML and which technologies they should or shouldn't adopt. One of the biggest questions I always get is: what is the state of this segment of computing in open source? You're working with us on our Deep Learning Foundation, and Uber has contributed Horovod and other projects into it. But people always ask me, when are we going to see some standardization here? What's the state of all this code? Help the audience understand where we are in terms of the technology and its use. Yeah, that's a great question. We're definitely at a very early stage. There's a lot of rich chaos. It's moving fast, and it's really exciting. Narrowly speaking, I would say there's some amount of convergence already happening. It's not necessarily a reduction in the number of different things you can use, but there's much more interface compatibility emerging. The fact that NumPy and PyTorch use very similar interfaces is an example of that. I will say that I think it's going to take a while for the dust to settle. 
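As an illustrative sketch of that interface compatibility (assuming only NumPy is installed; the PyTorch usage is shown as a comment), the same array code can often run unchanged on either library:

```python
import numpy as np

def standardize(x):
    # Relies only on methods and operators that NumPy arrays and
    # PyTorch tensors share: .mean(), .std(), subtraction, division.
    return (x - x.mean()) / x.std()

a = np.array([1.0, 2.0, 3.0, 4.0])
print(standardize(a))  # roughly [-1.34, -0.45, 0.45, 1.34]

# With PyTorch installed, the very same function accepts a tensor:
# import torch
# standardize(torch.tensor([1.0, 2.0, 3.0, 4.0]))
```

The function is written once against a shared "array-like" interface; which backend executes it is decided by the type of the argument.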
Actually, this morning when I was hearing you talk about the amazing success of Linux, I was remembering myself 20 years ago as a grad student building a Linux kernel from scratch over and over and over. And it was awesome, but only if you were extremely technical and had a lot of time on your hands. I think AI is a little bit at that stage, where the power is there, but it takes a lot of work and a lot of time. And I think over the next five years, hopefully it'll be less than 20 years, what we're going to see is convergence, and the tools becoming much faster to use, to where you can apply the same tools across many different applications in a much more reliable, turnkey way. Yeah, within the deep learning project of the Linux Foundation, there are a variety of different efforts. Again, you contributed Horovod and Pyro, but you look at things like Acumos, which came out of AT&T, a tool that's trying to make the use of data, and the packaging and deployment of models and so forth, easier for a general practitioner. And again, it's early days in that project, and early days in the projects that you're working on from Uber, but give us a little more detail on what we need to do to make AI and ML technology easier for a broader community to implement. Yeah, there are really two sides to this, which is why it's so difficult. One side is exactly the things we've been hearing a lot about. I love the continuous integration point, because I think continuous integration and really solid testing is one of the biggest contributions of the open source community to AI software. So I don't have a lot to say that's really novel beyond: do what those folks are doing, only in AI, right? But the other side, and I think the thing that complicates it, is that AI is still very much open research, and the ideas, the very foundational mathematics and ideas, are moving really fast. 
So there are some things that exist, and we know they exist, and it's appropriate to start making them easier to use. But there are other things we just thought up yesterday. There have been 30 papers on arXiv, which is the main source of all AI knowledge these days, but they were all in the last three seconds, right? So I think it's very hard to know how to make things easier when you're not quite sure which things are really successful and whether they're going to be supplanted by a new unification or new ideas tomorrow. So it's this kind of bootstrapping process: taking things that seem to be working, building tools to make them successful, and hoping that you haven't committed yourself to research ideas that are going to be replaced with something else the next day. How did the decision around using and leveraging open source at Uber come about? From your perspective, what was the value of saying, we want to put this stuff out there, put it in a neutral home, have a solid governance structure behind it? What role does open source play in the AI and ML world? Let me take one step back and describe how I think about the role of open source. I've been thinking a lot about highways recently, not because I work at Uber and there are cars, but because the interstate highway system, the Eisenhower highway system, is this amazing example of a common good that could not have been created by any one company or private organization. It was created collectively, in that case sponsored by the government, and it radically increased the productivity of the US economy for decades after it was created, right after World War II. So that's an example of something that had to be done collectively, but it provides infrastructure for all of us. So then you might ask, okay, what's the equivalent of that now? 
And I actually think these big open source projects are exactly the equivalent of that. There are things that we can't do individually, even in big companies like Google and Uber, but we can do collectively. And by doing it collectively, everybody who adopts it gains a huge productivity boost. And you might wonder, okay, Uber is big and energetic, why couldn't they just do this internally? I think the answer is that in general, but especially in AI, you need multiple viewpoints and multiple contributors to make sure that the software is exercised and tested. You need people contributing from a lot of different applications so that you don't, in machine learning speak, overfit to one thing. We've really seen that in the Pyro project that I've been involved with and in other open source AI projects: community input has helped us make really solid software that we can use and that can be used externally. There's also a huge synergy. The Pyro project is an example of an open source software project that exists only because we get to build on another open source software project, PyTorch from Facebook. And so this kind of collective building, I think, is why we just have to be doing it this way. One of the things, as we're seeing a lot of major technologies coming out, 5G, autonomous driving, edge computing with low latency out there: give us a sneak peek of the implications for AI technology. Is it going to be enabled by those adjacent breakthroughs, from both the network connectivity side and the autonomous world? I don't have anything to say about network connectivity, I'm afraid. That chart I showed, where you're like, nobody understands networking? There are a bunch of people here who do understand it. So I'll restrict myself to something simple, like artificial intelligence. Great. 
Yeah, so predicting the impact of any technology is hard. I think AI is interesting because, by its very nature, predicting what it will do is much harder, right? Artificial intelligence is fundamentally... But I thought AI was about prediction. Well, yeah, exactly. You don't know what your prediction is going to predict until you've done it. That's exactly the point, right? AI is about making better decisions: making them from more data, making them more rapidly, making more of them, making them more integrated. As you might imagine, better decision making is potentially a game changer for everything, everything that humans do. And because of that, and because we don't know yet which decisions are going to be made much better and which decisions are going to be made only a little better, we don't actually know fully what AI is going to do. I think this is both why everybody's paying attention and there's a lot of hype, and also partly why there's too much hype. So AI is going to change everything. My own opinion is that it's going to change everything in sort of the way that a power saw changed things. People had hand saws; they could cut things. All of a sudden, when you have a power saw, you can cut a lot more things. At first you just cut more things, but pretty soon you realize you can cut things differently. And AI is going to be like that. It'll first change the way we do specific tasks that we already do. It'll then grow out into the rest of our software development process. Some people call it software 2.0, which is catchy but a little bit oversold. And pretty soon we'll just come up with new things that we can do with it in almost every application. So I think what people have to do is keep track of where it's going, and not imagine that they can predict right now what it's going to do. Yeah, I want to talk about all the different moving parts: ML, DL, models, data, reinforcement learning, natural language processing, training, et cetera. 
When you look at all the different components that need to come together, any one of those things is complex. There's all this talk of data being the new oil and how people will be able to hoard that resource. Across those different components of AI, what's your thought on where we're doing well and where we can do better? There are just so many components, and I think a lot of people here in the audience are trying to figure out: what is my strategy? Where do I invest? Yeah. It's funny, you say all of those acronyms and I think, oh yeah, all the pieces of AI. Whereas I heard all the things about continuous integration and I thought, wow, look at all those complicated pieces. How can I keep track of all of that stuff? It's in the eye of the beholder. Exactly. But for your more general question, I think the answer, maybe the good news, is that five years ago the different subfields of AI were very different and were powered by very different statistical techniques and technologies. There has actually been a remarkable amount of convergence, first around ideas from deep learning, but it also turns out deep learning is not orthogonal to statistical techniques, so ideas from probability combine with deep learning. Those basic tools are behind most of the advances in AI. And I think it's not at all accidental that the rise of really important open source projects has coincided with AI gaining visibility. And I think the causality is, actually, let me say it differently. People talk about why AI is happening now, and they talk about compute and data. Right. And to a large extent, that's very true; the algorithms have not changed radically in recent years, only small advances. But I think they're missing something else, which is software. 
Something that has actually made the world of modern AI proceed much more rapidly, and allowed much more substantial software engineering projects, is that there are these frameworks. They build on old ideas like automatic differentiation, but they're systematic frameworks that let you rapidly build much more complicated AI software, and that let you deploy to hardware that is otherwise extremely hard to target, like GPUs. And so I think the compute is moving along, doing its own thing. The data is there, and everybody knows that they need to gather their own data and hopefully share it. I think the point of most leverage for investment is actually the software: understanding which software tools are going to allow the most systematic software development that's integrated into the rest of your software, the rest of your engineering. There are some that everybody knows are good things for AI, like PyTorch and TensorFlow, but what's emerging now is a very large ecosystem of layers that are integrated, or on top, or useful for deployment and monitoring. Just building one deep neural net doesn't actually solve the problems you have of deployment, integration, monitoring, testing, and evaluation, but there's a big ecosystem growing for that now. And I think the Deep Learning Foundation is actually an awesome way to foster that and focus the energy. Absolutely, I hear every day from people wanting to see more consolidation around these frameworks and more guidance. I want to come back to the data question. You said you hope everyone shares. About a year and a half ago, the Linux Foundation worked with our member companies' attorneys to create an open data license. The concept was, similar to how open source licenses have allowed for the smooth sharing of code from an IP perspective: 
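To make the automatic-differentiation idea concrete, here is a minimal, illustrative sketch of reverse-mode autodiff in plain Python. It's a toy, not how PyTorch or TensorFlow are implemented internally, but it shows the core trick those frameworks systematize: record how each value was computed, then apply the chain rule backwards.

```python
class Var:
    """A scalar that records how it was computed, so a gradient can
    be propagated backwards (reverse-mode automatic differentiation)."""
    def __init__(self, value, parents=()):
        self.value = value      # forward value
        self.grad = 0.0         # d(output)/d(this), filled in by backward()
        self.parents = parents  # (parent_var, local_gradient) pairs

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def backward(self, grad=1.0):
        # Chain rule: accumulate into this node, then push
        # grad * local_gradient up to each parent.
        self.grad += grad
        for parent, local in self.parents:
            parent.backward(grad * local)

# f(x, y) = x * y + x, so df/dx = y + 1 and df/dy = x
x, y = Var(3.0), Var(4.0)
f = x * y + x
f.backward()
print(f.value, x.grad, y.grad)  # 15.0 5.0 3.0
```

A real framework generalizes this to tensors, handles shared subexpressions by walking the graph in topological order, and compiles the same recorded operations down to GPU kernels.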
Let's try to apply those same practices to data, where you have a copyleft-style license with a share-alike function, and then a more permissive license: two main licenses for data sharing. How do you see the world of data sharing evolving? Are we going to see more hoarding? Are we going to see networks of data sharing? The Uber urban computing thing I saw this morning is very interesting, where you have this company, Uber, whose really important data is being anonymized and shared with regulators. I think that's super interesting. Are we going to see more of that? Less of it? What are your thoughts? I hope very much that we see more of it. I'm worried that we'll see less of it. I think there's a big pressure right now. People realize that data are fundamental. In some ways, data are more hoardable than software, because software now needs to be built collectively for it to function properly; we've got that part right. Data are more hoardable: it's like, I'm a dragon sitting on my bits. I think it could go in the direction, which sadly it has been going, where each company thinks, okay, my gold mine is my data and I'm not sharing it. Hopefully we can switch the worldview and start to think of data as something that is, one, a common good, something that belongs to humanity much more than to individual companies; and two, something that is going to, much like the interstate highway system, allow us to do more if we pool it, in a vastly superlinear way. It's not just, okay, we'll share, and I'll get the benefit of my data plus your data. It's that if we all share, the benefits we get are vastly bigger. Yeah, you're starting to see this in the cybersecurity realm, where people are sharing attack-vector data. If you share it, then you can run AI and predictive analytics on how to stop those attacks much quicker than if you were just trying to hoard all that data yourself. 
I mean, there probably are going to be companies that try to control that network externality and profiteer from it, but what we're seeing a lot of is coordination instead of hoarding. It seems like you're seeing some of that, but you'd like to see more. Yeah, and I'd like people to come to believe that if you take two different piles of data that hold very different information and you put them together, you get something much better than either one alone would give you. And the only way that's actually going to work out is not bilateral agreements, but something more like open data, right? Open sharing. Yep, yep. So I'll give you the last word. This is the obligatory question: everyone is also concerned about how AI will affect how we live every day and whether it will make us all jobless. I often tell Linus and Greg Kroah-Hartman and the kernel folks that as soon as we get a self-creating AI tool for software code, they're out of a job, but it seems like all that stuff is pretty far-fetched. What are your thoughts? Yeah, I think self-creating AI is pretty far down the road. I think even general human-level AI is quite far down the road, in my opinion. What's close is AI as power tools, right? Like the saw example. So I think it matters a lot when a brand new tool comes along, and when it's a power tool instead of a hand tool, it changes what you can do and how you do it. And that will result in dislocation: people who only know how to use hand tools, and all of a sudden there are power tools; they've got to do something else. I don't think it's going to replace all of humanity and all of work, but I think it'll change a lot of things. And we need to think carefully as a society about what to do about that, hopefully planning ahead instead of simply reacting. 
I'm hoping that large organizations like the Linux Foundation, and the Human-Centered AI Initiative that we're starting at Stanford, which are not part of companies themselves, can form the nexus and the middlemen for thinking about these questions and making choices that are pro-humanity, ahead of the impact of AI and the potential disruptions of these new tools. Yeah, well, that's certainly our goal. And we really appreciate the work that you've done with us on the LF Deep Learning Foundation, and the work you're doing in general. So thanks for coming and sharing your thoughts today. Let's give a round of applause for Noah. Pleasure.