Hey, good morning. Mark Twain supposedly said the coldest winter he ever spent was a summer in San Francisco. And this is probably a unique conference where they're giving out hand warmers. I've been to 45-plus countries around the world, some with the coldest temperatures, and I've never seen a hand warmer, but here in San Francisco I did. So Mark Twain is not dead. Long live Mark Twain.

Anyway, open source. My name is Arun Gupta. I work for Amazon as a principal open source technologist. I work with a lot of service teams across Amazon, helping them define and build their open source strategy. Open source embodies knowledge and experience, and it includes diversity and community. It allows you to spread an idea farther and faster than anybody could imagine. How does that relate to banyan trees? Let's talk about that.

My dad is 83 years old. He lives with us, and I'm very fortunate for that. He has always told me that when you grow up, you don't want to grow up like a coconut palm tree; you want to grow up like a banyan tree. Let's talk about that for a second. Coconut palms, while they typically remind us of vacations, warm weather, and nice places, have a deeply rooted system and are very resilient. But when you sit underneath one, it does not provide shade to anybody, because these are tall trees. As a matter of fact, if a coconut drops on your head, it can cause serious injury. My dad correlates that to being rude, not nice. The picture over here is of a mighty wax palm in the Cocora Valley in Colombia. These are some of the tallest palm trees; the one you're seeing here is about 200 feet tall, roughly a 25-story building.

Banyan trees, on the other hand, grow in the crevice of a host. They have prop roots that start from the stem, grow down toward the ground, and get grounded at multiple places. A banyan provides shade in the hot summer weather of India, and you won't be surprised to see one near a temple. They have a lot of medicinal value as well. My dad correlates that to always being humble, even when you have so much value to offer. The picture over here is the largest banyan tree in the world, in Kolkata, in the eastern part of India. This one is about 250 years old and spread over five acres, about a Manhattan city block. So these trees can grow really big. Growing old is inevitable for everybody; growing up is optional. The life lesson here is, as you grow up, you want to grow up like a banyan tree and provide shade to the people around you.

Now, I've known this lesson forever, but for the last few months I've been thinking that it describes open source as well. So let's talk about how banyan trees and open source have a couple of common traits. As we discussed, banyan trees provide shade to people in the hot summer weather of India. If you think about open source, anything and everything you might want in code is available there. Think about databases: there's open source. Think about containers: there's open source. Think about operating systems: there is something in open source that is available. And if it's not available, there is something you can build upon and enhance the value of.
So open source gives you a broad set of capabilities that you can build upon very easily. Let's talk about another aspect. With banyan trees, frugivorous birds pick up the seeds and drop them all over the place, and those seeds become the prop roots, which grow down toward the ground. That allows the tree to be grounded at multiple places, and because of those multiple grounded areas, the tree gets a rich set of nutrients that lets it grow further. That's the property of the banyan tree.

Now, how does that correlate with open source? At Amazon, when we are building, it's important that we are building using open source; we always look at it that way. Open source gives you the ability to pick something up and build on top of it. It allows unconstrained idea propagation across the board, and that is the prominent way software is built these days. Take Git as an example: it is a distributed version control system, and the ability to create a feature branch and send a pull request is what allows that unconstrained idea propagation, because anybody can send a pull request. I saw the numbers Sid was talking about, really impressive statistics: 35 new contributors joining the project every day, 2,200-plus pull requests sent. This is what allows us to change the world one commit at a time. In addition, open source licenses provide resilience for that idea propagation. They allow individuals and organizations to propagate an idea the way they feel is right, without coordinating with a central third party. Because it's open source, you're allowed to modify it, update it, and make use of it. That's the whole essence of how prop roots allow the tree to be grounded in multiple places, and how open source allows exactly that as well.

Now, at Amazon, we look at builders, and open source really enables that, because in open source you bring people in, they're allowed to build, and it becomes a two-way equation. We always take a balanced and pragmatic approach toward open source. Everybody knows Amazon's leadership principle of customer obsession. At Amazon, we always look at what the customer needs and where the customer needs help. We don't lead with what the coolest technology is; we look at solving customer pain points, and that is the technology we start contributing toward, or build a service around. Our customers love open source, and they ask us to help them innovate faster. So oftentimes we will look at projects like Linux and Kubernetes and start contributing to them. There are times when we create our own projects, like Firecracker, which we contribute to the world, and I'll give an example of that in a bit. And there are times when we contribute to projects that our customers have created, like Netflix's Spinnaker, where we are contributing actively and building the AWS integrations. So there are different ways by which we contribute to these open source projects and further the innovation.
Now, healthy communities are very important for the longevity of open source, and we constantly seek different ways to invest in the community, whether it's sponsoring conferences, having a booth, or giving credits, and I'll talk about a couple of those items a little later in the presentation. A lot of the open source contributions that come out of Amazon teams come out because we want to reduce technical debt. We don't want the maintenance burden, and we want upstream compatibility; that's the essence of how we look at it. And last but not least, one of the biggest benefits of open source is of course a lot more eyeballs on the source code. For example, we released s2n, our Apache-licensed TLS implementation, and open sourcing it gave us a lot more eyeballs and much better quality. So it's really about leveraging that banyan tree of open source communities and growing it better and better.

Let me take a specific example of an open source project and its community with the banyan tree analogy. At Amazon, security is job zero, followed very closely by operational efficiency. We are constantly seeking ways to improve the operational efficiency of our services; as we improve that efficiency, it reduces our cost, and we pass those cost savings on to our customers. As a matter of fact, since 2006 we have reduced our prices 72 times, and none of the customers have complained, which is a good thing. Now, AWS Lambda and AWS Fargate are two of our compute services, and they are extremely popular. With AWS Lambda, you bring your own function in a language of your choice and we run it at scale; AWS Fargate does the same for a container. AWS Lambda does trillions of executions every month, and AWS Fargate runs tens of millions of containers every week. If we can achieve operational efficiency at that scale, it brings the price of running the infrastructure down significantly.

So we started looking at virtualization-based security that could be implemented in a language with safety properties. We had this problem, and we looked at what already exists. Can we stand on the shoulders of others? Can we look at a tree that already exists? Can we build on something, or do we need to start from scratch? We looked at Rust, a programming language known for systems development. By the way, it has been the most loved language in Stack Overflow's developer survey. By using Rust, we remove a whole set of vulnerability issues right away, because that's how the language is designed. Then we looked at crosvm, an open source VMM built by Google to run Linux applications in Chrome OS. It turns out crosvm was written in Rust, and that proved the point that VMMs can be written in Rust and be successful. So we took crosvm and Rust as the seeds, and we created Firecracker out of them. But the use cases for crosvm and Firecracker were different, so we quickly diverged and started building the Firecracker code base. Firecracker today provides the startup time and density of containers along with virtualization-based security, and this is how we achieve operational efficiency for AWS Lambda and AWS Fargate. It is really an open source virtual machine monitor at this point.
Now, there is quite an active and diverse community around Firecracker. People want to include all kinds of different features, but Firecracker has a core set of tenets, and we still want to grow that virtualization-based community. How can we make it better and better? So we worked with Intel, Red Hat, Alibaba, and other partners and created a community called rust-vmm. The rust-vmm community maintains a set of common virtualization components, published as Rust crates, so that people who want to build a specific VMM don't have to rewrite a lot of the common pieces; they can leverage these crates and build on top of them. That's the whole value proposition. Cloud Hypervisor, for example, is one of the open source VMMs that has been created using these rust-vmm components, so we are already seeing that community grow.

Let's take another example. Kata Containers is an OpenStack Foundation project that was already doing virtualization-based security, but it was using QEMU as its VMM. One of their long-standing requests had been a lighter-weight VMM, so they could achieve operational efficiency as well. With the launch of Firecracker, we quickly collaborated with the Kata Containers community, they provided an integration with Firecracker, and now you can run Kata Containers using Firecracker on Kubernetes, as sketched below. It's all about that idea propagation, and it inspires other communities as well. QEMU, for example, added a microvm machine type. OSv and UniK are two communities that have been around for a while; they compile your applications into unikernels and run them a lot more efficiently. Firecracker just added a bit more oxygen to the idea that better operational efficiency is possible, for example by using that microvm machine type. If you think about it, there has been a bit of a renaissance over the years, from virtualization to containers to functions, and we've been very excited about this; that's the real joy of open source. Weaveworks has built a tool called Ignite on top of Firecracker. When we built Firecracker, our original thought was to increase our own operational efficiency, but Weaveworks took it to the next level and said, we're going to achieve operational efficiency for the end users as well. Using Firecracker, they can create VMs anywhere, they use Kubernetes for orchestration and GitOps for source code management, and they provide support for cloud-native tools and APIs. So this is how we look at that banyan tree around virtualization-based security growing. Our interest certainly is to grow that community; Firecracker is just one piece of it, and we are very excited to work with a lot of our partners on this.
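To make the Kata Containers point concrete, here is a minimal sketch, not taken from the talk, of how a pod can be scheduled onto a Kata-plus-Firecracker runtime on Kubernetes. It assumes the nodes' container runtime has already been configured with such a handler; the handler name `kata-fc`, the RuntimeClass name, and the image are placeholders.

```yaml
# Sketch: select a Kata Containers + Firecracker runtime per pod via RuntimeClass.
# Assumes containerd/CRI-O on the nodes exposes a handler named "kata-fc".
apiVersion: node.k8s.io/v1beta1
kind: RuntimeClass
metadata:
  name: kata-firecracker
handler: kata-fc            # deployment-specific; check your runtime configuration
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx-in-microvm
spec:
  runtimeClassName: kata-firecracker   # this pod's containers run inside a Firecracker microVM
  containers:
  - name: nginx
    image: nginx:1.17
```

Pods that don't set `runtimeClassName` keep running on the default runtime, so the two can coexist on the same cluster.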
Let me give you an example that explains how building and testing open source at scale is equally important, and how that builds confidence in customers to utilize open source. At Amazon, we are a big Java shop. We run tens of thousands of microservices using Java, and we realized the need to create our own builds of OpenJDK. OpenJDK is the reference implementation of Java, so we started keeping an internal mirror of OpenJDK, building the code, and running it at scale. The important part is that anytime you go to amazon.com, or you access a blob in Amazon S3, that is actually backed by an OpenJDK build. Then we realized, hey, our external customers' needs and our internal needs are quite similar, because external customers were asking for a similar release as well. So all we had to do was add some more flavors to it: TCK certification is important for them, they need long-term support, and they need support for a wide variety of platforms. The key part I want to talk about here is that we built a package for builders that really embodies build and test automation of open source at scale, and that is what builds confidence in customers. This is much more than TCK certification, which is just a compliance exercise; this is actually working at scale. If the moment I go to amazon.com there is a JDK backing the endpoint, I feel a lot more confident.

Let me talk about a project that embodies the open source methodology around machine learning on Kubernetes. Kubeflow is a project that was open sourced at KubeCon around two years ago, in 2017. The way it started was: there's JupyterHub, which data scientists use for doing training and inference. There is TFJob, the TensorFlow job (TensorFlow being a very popular machine learning framework), which is how you do training, and then there is TF Serving, which is how you do inference. A lot of customers were working on this, but each in their own silo, so there was an opportunity to collaborate. That is where several open source companies came together and created this project called Kubeflow, which was launched in December 2017. Now, as that crevice was formed, as that host tree was formed, several roots were planted along the journey of the open source project. Oh, we need hyperparameter tuning: Katib was created. Oh, we need a Python SDK that does end-to-end deployment, training, and inference: Fairing was born. Oh, we need ML workflows: Kubeflow Pipelines was born. So Kubeflow is again a very classic example of how a banyan tree grows. It starts with a simple need to collaborate, and then further needs keep forming. Right now we are very excited: I am actively engaged in the Kubeflow community, a lot of customers run Kubeflow on top of our Kubernetes service, and we are looking forward to the 1.0 release coming in the next few months.
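For context on what a TFJob looks like in practice, here is a generic sketch of the Kubeflow training custom resource. It is an illustration under assumed names, not the manifest used in the demo that follows; the image, script path, and flags are placeholders.

```yaml
# Sketch of a Kubeflow TFJob: the operator creates the worker pods and
# tracks the training run as a single Kubernetes object.
apiVersion: kubeflow.org/v1
kind: TFJob
metadata:
  name: mnist-training
spec:
  tfReplicaSpecs:
    Worker:
      replicas: 2                 # scale training out by adding workers
      restartPolicy: OnFailure
      template:
        spec:
          containers:
          - name: tensorflow      # the operator expects this container name
            image: tensorflow/tensorflow:1.15.2-py3        # hypothetical image tag
            command: ["python", "/opt/mnist.py", "--epochs", "5"]   # hypothetical script and flags
```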
Now, all that is good; I talked about Kubeflow in particular. One of the common customer asks, when they're running ML workloads on Kubernetes, is: my data scientists are using Jupyter notebooks, but I have a DevOps team, the engineering team that takes things into production. How do those two stories align? How does that work? That is what's called MLOps. So I'm going to show you a demo; it's a recording that will start on the next screen. What we're using is an open source stack: Kubeflow installed on Amazon EKS, Amazon's managed Kubernetes service that runs upstream-conformant Kubernetes, and then of course GitLab to show the entire MLOps story.

It's a basic demo, but you'll get a feel for it. What we are showing here is a Git repo with an mnist.py, which is on the screen right now. mnist.py is the training code, taken pretty much as-is from upstream, with some changes to how the arguments are handled. Then we have an inference client. So, one file for training and another for inference: two different Python files, and we make sure the files exist. Once we have the Python source code, we of course need to create the Kubernetes manifests. So I have mnist.training.yaml, a Kubernetes manifest in which the region, access key ID, and the other credentials are specified as environment variables in the file itself; they could be done as Kubernetes secrets as well (see the manifest sketch below). And then the next file is the inference manifest. Let's go back and take a look at that inference.yaml, not the inference client. Training is a one-time thing, so I create a Kubernetes pod or job, and the model is created and stored in S3; the inference.yaml then has a service, a pod, and all the Kubernetes manifests that need to be generated. Then I have a simple Dockerfile, which just inherits from TensorFlow, and I have a main.py and the mnist.py. The main.py is how we GitLabify the entire sample: that's my entry point, and what you're seeing now is main.py, which takes all the environment variables and runs everything in the GitLab environment very seamlessly.

And of course we have our .gitlab-ci.yml, where the different stages are defined (see the pipeline sketch below). I have a simple validate stage, and you can see from the dot in front of it that it is commented out; this is how we tested the environment, that I can run python mnist.py, which will generate my model, or train on the runner, in which case the model is generated in my local environment. It's commented out because it's not meant to run in production. Then I have the other usual stages. There's build, and in this case I'm building my image, and the image is pushed to the container registry. This is my container registry shown on the screen, and that's where my Docker image for mnist is sitting. Then, as part of "train on Kubernetes", I take the YAML file and apply it; of course I need to install kubectl before that, and then I fire up the training job. Once the training job is done, the model is stored in an S3 bucket. The inference is then going to take that same model from the S3 bucket and run inference on it, so that's a separate stage. Then there is the deploy stage, where I'm saying, okay, inference is ready, and now I'm going to run a client here. And the last stage is the test stage, where I do a kubectl port-forward: my inference pod is running in the Kubernetes cluster, so I do the port-forward and run a simple test against it. And that allows me to run the test.
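Going back to the training manifest: the point about credentials is easier to see in YAML. This is a hedged reconstruction rather than the repo's actual file; the job, image, bucket, and secret names are made up, and it shows the secret-backed variant the talk mentions as the alternative to inlining the values.

```yaml
# Sketch of the kind of thing the training manifest carries: a one-shot
# training job whose AWS settings arrive as environment variables. The demo
# put the values inline; this variant pulls the credentials from a Kubernetes
# secret instead, as the talk suggests.
apiVersion: batch/v1
kind: Job
metadata:
  name: mnist-training
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: training
        image: registry.example.com/mnist:latest       # hypothetical image
        env:
        - name: AWS_REGION
          value: us-west-2
        - name: S3_BUCKET
          value: my-mnist-models                       # bucket where the model is exported
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef: {name: aws-creds, key: access-key-id}
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef: {name: aws-creds, key: secret-access-key}
```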
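And here is roughly how the pipeline file ties those stages together. Again a hedged sketch: the job names mirror the narration, but the scripts are simplified placeholders, `install-kubectl.sh` and `inference_client.py` are hypothetical helpers, the bucket name is a stand-in, and the services or images needed to run docker inside the runner are omitted for brevity.

```yaml
# Rough shape of the .gitlab-ci.yml the demo walks through.
stages: [build, train, deploy, test, cleanup]

.validate:                      # leading dot = disabled; used locally to sanity-check mnist.py
  script:
    - python mnist.py

build:
  stage: build
  script:
    - docker build -t $CI_REGISTRY_IMAGE/mnist:$CI_COMMIT_SHORT_SHA .
    - docker push $CI_REGISTRY_IMAGE/mnist:$CI_COMMIT_SHORT_SHA

train:
  stage: train
  script:
    - ./install-kubectl.sh                  # the demo installs kubectl before applying manifests
    - kubectl apply -f mnist-training.yaml  # model lands in S3 when the job finishes

deploy:
  stage: deploy
  script:
    - kubectl apply -f mnist-inference.yaml # serves the model stored in S3

test:
  stage: test
  script:
    - kubectl port-forward svc/mnist-inference 8500:8500 &
    - python inference_client.py            # classifies a sample image against the forwarded port

cleanup:
  stage: cleanup
  when: manual
  script:
    - kubectl delete -f mnist-inference.yaml -f mnist-training.yaml
    - aws s3 rb s3://my-mnist-models --force   # remove the model bucket (name is a placeholder)
```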
And then of course I have my usual cleanup stages, where I remove the model and remove the pod so that I can run the pipeline again and again if I need to. This is all available in a GitHub repo, and we'll be happy to share the link with you afterwards. So: clean up training, clean up inference, clean up S3; all of those cleanup tasks are done, and the last step is of course to remove the S3 bucket as well.

What you're seeing now is the GitLab interface. I say build, which builds my Docker image, and you see the output here; the last step you see is the image being pushed to my container registry. Then I say train on Kubernetes, and as part of that my jobs are created; you can see the last line says mnist-training created. That is my pod being created, and behind the scenes it will create the model and save it to my S3 bucket. Then I say deploy, and as part of deploy I run my inference pod, which I can then refer to; the last line shows mnist-inference running, so my inference pod is up and I can do the inference later. All of this is running behind the scenes on an Amazon EKS cluster, so you can scale it any way you like. And the last step here is of course test. As part of test, I'm just picking a random image from the test dataset. Not a real-world use case, but I pick a random image and ask, okay, what is this? In this case, for example, it says the model thought this was a sandal, and it was actually a sandal, class five. So we're just using an MNIST-style dataset to show you exactly how training and inference can be done end to end from the GitLab interface. And this is MLOps.

By the way, this is only a basic start; this is just the starting line. This is where we want to hear your feedback on how we can make it better. This is where we were thinking, hey, data scientists don't operate at the Python level, they operate in a Jupyter notebook; how can I bring that interface into an end-to-end, seamless integration using GitLab, or whatever your tools are? Tools should not get in my way; tools should get out of my way.

One of the most common questions customers ask us all the time is: how does Amazon deliver software? How does Amazon do things? How do you build for scalability, high availability, and fault tolerance? And how do you rapidly deliver software at scale? So at re:Invent we launched the Amazon Builders' Library. This is really an open resource for anybody, and it gives a look under the hood at how Amazon builds the software that underpins Amazon.com and AWS. The library is a collection of articles written by practitioners who are building software, failing, recovering, and building resilience into their software on a regular basis. These are very deep-dive articles; for instance, you'll be surprised how many ways Pac-Man can fail if it's running as a distributed system. It's a very interesting article, and I would highly recommend reading it. It's just a start, and there is a lot more to come.

Now, we would like you to grow your own banyan trees. Build your banyan trees and plant the seeds wherever you want to. To that effect, we launched a program late last year, and we want to give credits to any open source project that wants to use AWS as its infrastructure.
So I highly recommend: scan this QR code, fill out a form, and tell us where you want to grow your banyan tree, how you want to grow it, and how you're going to use AWS for the infrastructure. Very simple requirements: we just want an OSI-approved license, and preferably a nonprofit or a foundation behind the project. With that, I want to leave you with a list of references. If you want to learn anything about open source at Amazon, it's opensource.amazon.com. Talk to us on social at @awsopen, and if you want to either contribute or learn more about our contributions, we have a blog as well. Thank you so much.