Hi, everyone. It's really great to be here; I'm very humbled to be talking to this group in particular. And thanks, Jim, for a really nice introduction, although that's not what I'm going to say. What I want to talk about is open source at Docker and, more specifically, how the Linux model of open source helped us deal with the scale of our open source operations, what we learned in the process, and, I hope, a few lessons along the way that maybe others can benefit from. First, I want to talk a little bit about what we're trying to do at Docker. I'm actually not going to talk about containers a whole lot, because I assume everyone here pretty much knows that Docker does containers. I want to talk about why we're doing it and what our purpose is. It really boils down to one small sentence: we're trying to make the internet programmable. I want to unpack that, because it's kind of a big sentence. First, we believe in something we call tools of mass innovation. In other words, we believe that the more people get to create and invent new things and share their inventions with the world, the better off society as a whole will be. And we believe that in order to do that, people need tools that are specifically designed to help them create and innovate more, and to do that at a very large scale. That's what we mean by tools of mass innovation, and that's what we're trying to build. Second, we think one of those tools of mass innovation in particular is extremely promising, with a lot of potential to help a lot of people create a lot of great stuff in the coming years, and that's the internet. Specifically, making the internet programmable. Of course, the internet has been around for a long time, and it's an awesome communication network. But we think we're seeing, over the last few years, that it's becoming more than that. It's really becoming a programmable platform.
And of course, we've heard "the internet is the computer" for a long time, but it's actually becoming true now, because so many different devices are coming online that can do so many incredible things. More generally, almost every aspect of human society now has a really deep and powerful programming interface that's available online to pretty much anybody. The result is that you can create applications that really take advantage of that, operate at the scale of the internet, and do amazing things. We know this already because we consume these applications: the apps and cloud services we use every day would not fit in the scope of any single computer anymore. It's almost intuitive and obvious for me to say that. But building these applications is still the privilege of a relatively small number of people in the world, and we think programming the internet is just too awesome and too important an invention to limit to a small number of people. So what we're trying to do is make it accessible to more people, and make it as seamless as possible. The way we're doing that is by building a stack, and this is a little schematic of the Docker stack. It's pretty simple. There are commercial products on top: we help businesses solve their specific problems using this technology. The main problem we solve is helping businesses create what we call a digital supply chain. If you're a business creating and distributing applications, you need lots of different teams, systems, and locations around the world connected in a pipeline, in the same way that for physical goods you need a worldwide supply chain. The same is true for software, so we help businesses deal with that. And those commercial products are built on top of a development platform.
That means tools for developers, and environments that help programmers create and share their applications as seamlessly as possible, for both hobbyists and professionals. That platform is built on top of an open infrastructure: code and standards that are developed in the open by a lot of the people in this room, right? Linux is an important part of that infrastructure. It's the stuff that's invisible to almost everybody, but without it the internet would not work at all, let alone be programmable. So that infrastructure is extremely important. We rely a whole lot on it to build our platform, and at every opportunity we contribute back what we take. The result of all that is a product, Docker, that a lot of people use, and it's still growing pretty fast today. To give you a sense of the growth, I made a little chart (I got some help, because I can't design that well). What we're measuring here is the number of what we call pulls: the number of containers that have been downloaded from a service we call Docker Hub, which is the place where anybody can share and consume containers to assemble into their applications. That number is a good proxy for how many people use Docker and how actively they're using it. In 2014, we reached a million pulls. In 2015, we reached a billion pulls. This year, in 2016, we passed six billion pulls, and I think at the moment we're adding another billion every six weeks. That's a lot of containers being downloaded by a lot of people. By any reasonable measure, Docker is something that a lot of people use. And so of course we ask ourselves a lot how we reached that level of growth, and there are two reasons to ask. The first reason is that, selfishly, we want to keep it going. We think it's good that more people use Docker, and we think we're just getting started.
There are a lot of people who could use it for a lot more things, and knowing how we did it in the first place will help us keep it going. The second reason we want to understand how is so we can share what we learned, so that more projects can benefit from those lessons and grow faster, because we think that in the end, like Jim was saying, everybody benefits, including us. The main reason we think Docker has grown at that pace is open source. A lot of people will say, well, obviously. But just to give context, we're a relatively small company: 250 people at Docker. And when we launched the current iteration of our project three years ago, there were 30 of us. A small company with a really big goal, and for a company like that, it wouldn't be possible at all to achieve our goal without open source. Docker would just not be possible without open source, full stop. That's because there just aren't enough of us to solve all the technical problems we need to solve to make the internet programmable. It's impossible. There are a lot of smart engineers at Docker, but there just aren't enough of them. With open source, we get to attack these technical problems, share the result of our work with others, and invite everybody else in the community to contribute to, reuse, and improve these components. We get to use the results to solve our problems, everybody else gets to reuse the results to solve their problems, and everybody wins. I'm not going to explain open source to you, but the point is that it's really crucial for us, and crucial for our growth as a product and as a company. To give you a sense of the scale of our open source operation: today we've open sourced about 50 repositories. These are projects large and small that we've opened up in the process of building Docker. About 2,300 people have contributed to one of them, and currently we process 1,200 patches a week.
That's a lot of patches. We call it the fire hose; it really feels like drinking from a fire hose, and those of you who work on Linux know the feeling. This is not as large as Linux, but it's closer to Linux than 99% of projects out there, and it's just a different scale from most projects. So when we started dealing with that scale, we had this problem of, okay, how do we handle this? We had to look for help and examples from successful projects, and of course the obvious idea was to borrow from Linux. We ended up using a whole lot of ideas and rules and tricks from the Linux project. This is not a complete list, but to give you a sense: we borrowed the BDFL system; the maintainer system; the release model, the combination of relatively rapid time-based releases plus stable interfaces, which lets you move fast without breaking applications as you move; and the separation between project maintainer and employer, right? You have an employer and you have a role in the project; both are important, but they're separate, and it's good that way. There were a whole lot of other aspects too: the DCO, the legal framework for protecting contributors, et cetera. We borrowed a lot from Linux, but we also changed quite a few things, and there's one primary change that we made. The resulting model is what we call the Linux model with a twist, and the twist is pretty simple to explain. Linux really started with the plumbing, right? The Linux kernel is a core component, and then, as the plumbing improved, different finished products emerged over time to serve different kinds of users. So we have distros focusing on different aspects of what you can build on top of Linux. With Docker, we did it exactly the other way around. We started with the finished product.
The goal of Docker is to solve problems for users, and along the way, as we do that, we take opportunities to extract open source components and open them up. The whole thing is done in the open, but the mindset is very different, and in fact that model is closer to what you'll find in large consumer tech companies. If you think of Apple, Google, Facebook, et cetera: all these companies have their own users and their own products, and they're laser focused on solving problems for their users. In the process of doing that, they end up employing a lot of smart engineers to solve the hard problems they need to solve, and along the way they carve out implementations and components, open them up, and invite the community to join, and then that open innovation process kicks off. That's how you end up with projects like LLVM, Chromium, et cetera. That's the model we're following. So in the process of helping as many programmers and businesses as possible program the internet, we ended up carving out components like libcontainer, SwarmKit, Notary, and 50 more. That's the best summary of our model. If you look at the timeline here, you'll see two different things: a few examples of problems we've solved with Docker along the way, so features we added to the product, and at the same time a timeline of the different components we've open sourced. It's not a complete list, but it should give a general sense, and you'll notice there's a correlation: as we solved different problems, we open sourced different components along the way. For example, early on, one problem we had to solve was container provenance, or origin.
If you want to deploy containers, you want to know where they come from, and you need some sort of cryptographic verification of that. In solving that problem for our users, we ended up implementing a component called Notary, which provides all the primitives for content trust, and we open sourced it as a separate, loosely coupled component. Now anyone else can build their own platform using the primitives in Notary. More recently, we introduced built-in orchestration in Docker, in the latest version, 1.12. That solved the problem of a lot of users saying: well, I want to run containers not on one computer at a time but on a whole swarm of computers. So how do you... Hello? Yep, okay. Is it something I said? How do I do that on a whole swarm of computers, and how do I do that easily, out of the box? In the process of solving that problem, we ended up open sourcing a component called SwarmKit, which does the same thing: it provides all the primitives, and so on. So you get the general idea. I want to go into a little more detail on that model and take an example, but really, we think this approach of solving user problems first with a finished product, which is Docker, and then open sourcing components along the way, following the Linux model as closely as possible for those components, is the key to how we've managed so far to keep that cycle going that Jim was talking about. So, to take a little case study (hopefully I won't run out of time): one problem we've dealt with a lot this year is a twin problem from developers and administrators. It has to do with using Docker with your host platform of choice in a way that feels native. Developers tell us: I use, for example, a Mac (replace Mac with your desktop platform of choice), and it doesn't feel native.
I use a Mac and I use Docker, but there's a lot of glue I need to put together to make storage and networking work properly, and I have to install a separate hypervisor, and it's just kind of a pain. I just want to develop on Docker. Can you help me? On the ops side, we get a lot of feedback from administrators saying: well, I use this or that cloud platform (in this example I took AWS, but it's the same thing for every cloud platform), and same thing, it doesn't integrate natively out of the box with my storage and networking and authentication, and I have to add all this glue, and it just feels like a waste of my time. Can you help? We got that request a lot, it ended up at the top of our list, and we looked for a solution. The result is two products, currently in beta, called Docker for Mac and Docker for AWS, and they're exactly what they sound like: Docker optimized for and adapted to the Mac, and Docker optimized for and adapted to AWS. You get the same portable experience, you deploy your container-based applications, you manage them, et cetera, with a portable interface, but at the same time you get to install it, upgrade it, and attach it to your storage and networking in a way that feels very, very native. So it's not a new feature, but it actually solves a problem that a lot of our users have. So this is the moment where I try to give a live demo. I didn't make any offerings to the demo gods, so it may or may not work, but, well, can I give a demo? You want to see a demo or no? Yes? Okay. Okay, so let's switch to the computer. How much time do I have? Okay. Amazingly, I'm not running out of time.
So this is my Mac, and I want to develop on it using Docker, so I'm going to get Docker for Mac and run it. Notice here, I don't know if you can see, but there's a little whale that appears in the menu bar, and it's moving a little to indicate that things are getting set up. Just to explain what's going on: this application is absolutely everything you need to run Docker on your Mac. There's no separate hypervisor or Linux host you need to install; in this one Mac application there's a hypervisor, the Linux host is integrated, and there are network and storage drivers designed to hook into the Mac system. So basically you have everything you need, and normally, after having double-clicked, all I'll have to do is open a terminal, and if I type docker info, I've got a working Docker engine. And if I want to run, say, Redis... actually, is this readable? Not really, huh? Better? Okay, really big, okay. You know, I want to run Redis, and here we go, we've got Redis running. So I just went from zero to Docker in like 30 seconds, and that's Docker for Mac. Notice there are no bells and whistles, no new features, just easier-to-use Docker. So let's say I want to develop my application. I've got a little test application here: docker service create... this is really cool; this is public, by the way, it's on Docker Hub, it's one of the six billion downloads, I guess. I'm just going to expose port 80, and I'm going to scale it: replicas, 10 replicas. Okay, so now I've got, whoops, a service running; you can see it's scaling to 10 replicas. Now if I open my browser and go to localhost, I've got my little test application running, and you can see it's serving requests from different containers, et cetera. So, boom, I just went from zero to having Docker and developing an application locally on my Mac, and I didn't have to do anything, right? That's the first part of the demo, and it worked. Okay, so far so good.
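For reference, that first part of the demo boils down to a handful of Docker 1.12 commands, roughly like this. This is a sketch, not the exact demo script: the image name demo-app is a stand-in for my little test application, and I'm assuming it listens on port 8080 inside the container, matching the port mapping I fumble with later.

```shell
# Check that the engine bundled with Docker for Mac is up and answering.
docker info

# Run Redis in a container, pulled straight from Docker Hub.
docker run -d redis

# Put the engine in swarm mode (a single-node swarm is fine locally).
docker swarm init

# Create a service from a public test image ("demo-app" is a placeholder),
# publish it on port 80, and scale it to 10 replicas.
docker service create --name demo --publish 80:8080 --replicas 10 demo-app

# Watch the replicas converge to the desired count.
docker service ls
```

The point of the demo is that nothing here is new: these are ordinary Docker commands, and Docker for Mac just makes the engine behind them appear with one double-click.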
For the second part, I now want to use AWS as my example to go to production, and I have it all set up. I'm a certified AWS admin, I have my account, I spent like weeks setting everything up, and it's really simple. So now I want to create a swarm and deploy this application in production. There's an optional feature in Docker for Mac, which is really cool, which I use: you can open the little menu here and say create swarm, and here you can say create swarm on AWS. If I do that, it's going to send me to the console on my familiar Amazon account, and I'm going to choose Frankfurt as a region; that seems appropriate. For those of you who are familiar, this is a CloudFormation template, so it's the most native possible way to deploy stuff on AWS. Here I can choose the number of nodes and the instance size; let's go crazy... not completely crazy. And I can use my usual SSH keys, so that's an example of tight integration: I don't have to come up with a completely new authentication system. And here, optionally, I can register my swarm with Docker Cloud, and I'll explain what that means in a minute. I actually won't do this here, because it takes too long to provision on Amazon for this keynote, but yes: acknowledge, create... okay. All right, it already exists; I screwed up the name choice. It doesn't matter, because I actually created one before this presentation, but you get the idea. So, ten minutes later, ta-da, I have a swarm and it's working. Then what happens if I go here? The key thing to understand is that if I use this optional Docker Cloud feature, all of my swarms call home to a service we maintain called Docker Cloud, which lets me log in to see and manage all of them. It's connected to team management, so I can say, you have access to this swarm, et cetera. The result is that once it's up and running, back on my Mac I can say, okay, my swarm; this is my Berlin demo swarm here that I created just before the presentation, and then
what it's going to do is connect to it, establish a secure tunnel, yada yada yada, and then, boom, I have a shell, and here I have remote access to my swarm. So if I list the nodes here... this is never readable, one second... here you can see 10 nodes. These are 10 EC2 instances, with their own IPs, in a swarm, ready to deploy stuff. From here, let's run the same application: docker service create... actually, I think I did it before, there you go. Same application as I ran locally, I'm going to run it here, and now you can see my service scaling to 10 replicas. Now if I go back to my Amazon... my Amazon... I'm not actually a certified Amazon admin, that's the problem... so many services, okay, okay, this is the one. If I go here, in my very familiar, very simple interface, I can see the address of the ELB load balancer that was configured for me, so that's native Amazon load balancing, all set up for me. And if I go to that address... that doesn't work, of course. Oh, is it 8080? Damn, it's not working. Oh, wrong region, Frankfurt, okay. How much time do I have? Okay, we're good. Should I keep going? I'm almost done. Ah, it's supposed to expose... oh, I did 8080 to 8080, okay. Okay, let's... oh, I did it the other way around, okay, it's 80 to 8080, expose port 80. Okay, anyway. So, okay, cool, we can switch back to the slides. You get the idea: I went from zero to Docker on the Mac and Docker on Amazon, it's all hooked up, and I can start developing and going to production. It's really simple, and these are not new features; all the commands you saw existed before, but this actually solves a really significant problem for our users. Now, to connect that to what we were saying about open source and solving hard technical problems, this is how we explain this feature to our users: you have Amazon, you have Docker, this is the architecture diagram, it works great together, the end. Now, in reality, if you zoom in, it looks more like this; in other words, a really complicated set of technical components, and this is a
simplified view, actually. Amazon itself is a lot of different services, of course, and we integrate with a lot of them. On top of that, you need our container engine configured in swarm mode, so the nodes can scale, discover each other, et cetera. You need a Linux host, because EC2 needs an OS to boot, so that has to be integrated. You need custom plugins for Amazon storage, Amazon networking, Amazon administration, et cetera. And then you need infrastructure management, and it turns out that piece was particularly hard, because what I showed you is actually the easy part. Deploying to Amazon: click, click, amazing, I have 10 nodes. The problem is that that's the first day, and then you have months and months and years of operations, and you need to keep your infrastructure working through upgrades and configuration changes and topology changes and failures and all of that. So you need a component that can keep the infrastructure in the desired state, in a way that's both deeply integrated, in this case with Amazon, and also portable, so that we can do the same thing on Microsoft Azure, Google Cloud, IBM Bluemix, et cetera. And that's actually hard. You need to be able to do it in a declarative way; really, what you want is a self-healing infrastructure that just kind of works, so you don't have to wake up at four in the morning to reboot individual instances. It turns out those requirements added up to something really hard for us to do, and we ended up having to implement a custom component to fit in the demo that you saw. We had to dig pretty deep into our well of engineering talent; we actually had to acquire a company to solve this problem. Some of you might remember that earlier this year we acquired a company called Conductant. It's a small team of some of the best systems and operations experts in the world; they helped scale production deployments at Twitter and Zynga and Google and places like that. And for those of you who know Apache Aurora, it's a
cluster management system that's in production on tens of thousands of nodes in places like Twitter and Apple, and they created it. So this is some serious engineering effort that we invested, and the result is a component that helped us solve the problem we talked about. Following the model I described, we decided, okay, we should open source this. We actually haven't yet, but we thought now would be a good time to do it, so we can talk about a new component to illustrate our example. So right now we're introducing InfraKit, which is exactly what it sounds like: a toolkit for creating and managing infrastructure that's scalable, self-healing, and declarative, and it embeds years and years of experience operating real systems at really large scale. It's really cool; you should check it out. Well, don't check it out quite yet, because it's not open source right now. We actually thought that since we're at an open source conference, we could open source something live on stage at the end of this talk. And I'm actually running out of time, so... do you want to do some live open sourcing? Okay.
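To make "declarative and self-healing" a little more concrete before we open it up: the core idea is a reconciliation loop. You declare a desired state, and a loop continuously repairs the observed state to match it. Here's a deliberately tiny sketch of just that idea, using files in a directory as stand-ins for instances; this is an illustration of the concept only, not InfraKit's actual interface.

```shell
#!/bin/sh
# Toy reconciliation loop: files stand in for instances.
# Declared desired state: exactly 3 "nodes", named node-1..node-3.
DESIRED=3
CLUSTER=$(mktemp -d)

reconcile() {
  # Provision any node that is missing; nodes already present are untouched.
  i=1
  while [ "$i" -le "$DESIRED" ]; do
    [ -e "$CLUSTER/node-$i" ] || touch "$CLUSTER/node-$i"
    i=$((i + 1))
  done
}

reconcile
first=$(ls "$CLUSTER" | wc -l | tr -d ' ')    # 3 nodes after the first pass

rm "$CLUSTER/node-2"                          # simulate an instance failing
reconcile                                     # the loop heals the cluster
healed=$(ls "$CLUSTER" | wc -l | tr -d ' ')   # back to 3 nodes

echo "after first pass: $first, after healing: $healed"
```

InfraKit organizes this same observe-and-converge loop into pluggable pieces that drive real infrastructure, which is what lets it stay deeply integrated with one cloud while remaining portable to others; the sketch above is just the shape of the loop.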
So let's go back to the computer. If you go to github.com slash Docker slash InfraKit... oh, I have to sign in. And again, it was down earlier when I was rehearsing, but it looks like it's back up. Oh, okay: make sure to enable two-factor authentication, it's really important. Okay, so you can go check it out. The readme is really complete; there's a lot of good... wow, there's a diagram now... examples. It's a really neat component if you're into that kind of thing; there are some really cool systems you can build. Okay, so: make public... yes... InfraKit... I understand... okay. It's open source. Cool, cool, cool. Okay, so I'm out of time, so if we can go back to the slides: that's all I have. If you're interested in InfraKit or any of the other components we talked about, all the plumbing, there's a session later today at 12:15. It's called, I think, deploying containers without Docker, or something like that. It's actually Docker engineers and some of the core maintainers of these components showing you how to assemble these features without having to use the whole Docker platform, just using the plumbing. And if you're interested in Docker for Mac and Docker for AWS, or you want to suggest another edition, just check out the website. And yeah, that's it. Come talk to me anytime, I love this stuff, I'd love to talk about it, and enjoy the conference.

Hold on one second, hold on one second... oh, Solomon, come on back up, come on back up, come on back up. I want to quickly ask you a question. I love the way you described how you're taking products, and then you just open sourced one, which was super cool, and then those start to go into that feedback loop. It was perfect, because I hadn't thought of it: instead of starting at the project phase, you start at the product and you open source projects along the way. I would love to start at the profit phase myself, personally, and then go back.

Oh yeah, so in a few years, hopefully, we can start with
our revenue. And there you go. Exactly. Google's got that one nailed. Yeah, yeah. Great, great talk. And for all of you who have ever done something like this: normally developers are not working in front of 2,000 people, live on stage, so I thought you nailed it. Thank you. Well done. Thanks. All right.