I'm from Visual Studio Team Services and Team Foundation Server. That's our end-to-end DevOps software from Microsoft. We do everything, we support any language, and we can deploy to anything, not just Azure. So we have everything from Agile planning through to automated build, automated testing, and deploying. Some of the cool stuff about our product in particular: we love to talk about how we use Visual Studio Team Services to build VSTS and TFS. So we have our very own 700 engineers across 40 feature teams, working on topic branches off master, and pushing hundreds of commits every day in our product. On top of that, we're also responsible for the DevOps and Agile transformations that we do across Microsoft. So not only does our own team use our product, we also have the 75,000 engineers at Microsoft using VSTS to build faster, to ship faster. So we absolutely love talking about that. So totally come by our booth, ask us about our CI/CD pipelines, people, processes, projects, we love talking about it. We also have these awesome t-shirts that you should all come pick up. They're Game of Thrones themed if you are a fan. Thank you.

Okay. We're really thankful to all of our sponsors. So thank you, Microsoft. We could not do this event without the sponsors that we have. All right. We also have Jimmy Guerrero, the VP of Marketing from Signify. So Jimmy, you want to come out here? All right. So again, thank you so much for sponsoring. Good round of applause. Thank you.

Good morning. So at Signify, what we do is layer artificial intelligence and machine learning on top of your existing DevOps tools to reduce alert noise, but also to perform correlations so that, for example, when you get an alert, we're able to surface all the relevant data related to that alert regardless of the data silo that it lives in.
The other benefit of these correlations that are happening in real time and historically is that any solutions that you've implemented in the past are recommended for problems that you may be experiencing today. The final benefit of these correlations is that you're able to get predictive insights, which means recommendations for fixes that you can implement today that are going to fix problems that you see trending into the future. So if you've already made investments in things like GitHub or Jenkins, Chef, Puppet, Ansible, on the post-production side Splunk, Loggly, AppDynamics, New Relic, Datadog, et cetera, and things like PagerDuty for incident management, Signify is going to help you use all those tools to get to accurate answers faster. Because many of you have probably experienced that we're shipping more code faster with fewer defects, but our uptime KPIs haven't actually moved. Swing by the booth and we can talk about how we can get you to those accurate answers faster. Thanks.

Thank you, Jim. Thank you, Signify. And yeah, absolutely. Please go visit our sponsors. There's going to be ample breaks and they have very interesting products to show you. There's lots of cool swag out there. One thing that I did want to mention is there's actually going to be a drawing at the end of the second day. We're going to do that right before the event closes out. So there's some really cool prizes that you can get in a drawing, which is not a raffle because a raffle is legally regulated in Boston and I cannot call it that, but it's sort of like a raffle.

All right, so our next speaker, Tobias Macey, is going to talk about open sourcing your infrastructure, and I was super excited when he came through with this presentation idea because it's actually something I'm pretty passionate about, both open source and infrastructure code, and seeing the two together is great.
So Tobias is a systems architect with nearly a decade of experience designing and building distributed, production-grade web, data, and sensor platforms. He specializes in designing and building complex, comprehensive automation for powering scalable and maintainable systems. His current adventure is helping MIT redefine global education as the manager and lead engineer for the technical operations team at the Office of Open Learning. He also works as a consultant helping growing businesses evolve their initial infrastructure. In addition to his work, he also hosts Podcast.__init__, which is a podcast about Python and the people who make it great. He is one of those people, so I would really appreciate it if you gave him a big round of applause.

No worries. Hi everybody, so as James said, I'm gonna be talking about open sourcing your infrastructure, why you should do it, why I did it, and some of the challenges and reasons behind all of that. So you're wondering, who is this guy? Oh, I'm Tobias Macey. I'm the DevOps manager at the MIT Office of Digital Learning, recently rebranded as the Office of Open Learning. It's hard to keep up, so I'll just stick with the Office of Digital Learning. I also host Podcast.__init__. I'm a system architecture consultant by night. These are all the ways you can get in touch with me.

And what I'm talking about is that there are a few different ways to think about open source. So when everybody thinks about open source, you're probably thinking Linux, Ruby, Python, cowsay. Well, I'm not talking about those. I'm talking about small open. So as you can see up there at the top, we've got the salt-ops repo, which is where a lot of the infrastructure code for my particular department goes. Right below that, you've got Chef. So you can see there's a huge disparity in the overall popularity, but I'd like to argue that they're both equally critical to our overall ethos of DevOps and infrastructure. So why do you care?
Why do you want to open source your infrastructure code? Well, I think that one of the reasons is that it's a lot easier to get help when you're stuck with that bug and you're saying, how on earth am I ever supposed to solve this? Because rather than having to bang out 15 pages of documentation and try and make sure that you redact all the pieces that you don't want people knowing about, you can just send a link and say, okay, I've got this problem. This is where it is, or at least the general idea. Maybe here's a diagram of the stuff I've been working on, and then you can send that out and people can actually see exact, concrete evidence of where that problem might be coming up, as opposed to, I've got a 15-line snippet of my code and the issue is actually on the 16th line, but I can't show that to you because that's proprietary.

Also, as an individual, it improves your visibility across the world, across your sort of peers, and who doesn't like working with open source? As a team, it encourages loose coupling, because if you know that everybody in the world is gonna be able to look at everything that you write, you wanna make sure that it's architected well, that you can actually make smaller parts of it usable for other people, as opposed to giving people the end of the string that they then have to unravel and try and figure out, okay, how do I break this one piece that I care about away from the rest of this system?

It makes deployment easier because you don't have to say, okay, did I remember to put the SSH key for that GitHub user on this server so that it can clone my infrastructure code? You can just say, nope, it's open source. Here's the URL, pull it. Makes fighting with that a lot easier. And it encourages better code hygiene, because you're less likely to put in those AMI keys, or IAM keys rather, or the SSH key that you care about, because right before you push, you're thinking, do I really wanna add that in?
Do I really want everybody to be able to see this and what'll happen if I do? So maybe I should put this in a different location or abstract it away, maybe put it in vault. And why does your boss care? Well, it fosters community goodwill because I'm sure most of you have been out there on the job market looking at companies and thinking, why do I wanna work for these people? What are they gonna offer me? And one of the things personally that I enjoy seeing is a strong commitment to open source where you can say, I know that if I go to this place, I'm gonna be able to work on cool things and I'm gonna be able to let other people see it rather than I'm gonna be working on cool things but my NDA says that I can't ever tell anybody about it except in the most abstract terms. It also improves your recruitment potential for that same reason because people outside of the organization will be able to see, oh, they are working on cool things, they are using tools that I'm interested in or maybe they're using tools that I'm not interested in but I can help evolve them to a cleaner and better state. It also helps with employee onboarding because rather than having to start first day, say, okay, here's everything that we're working with, here's everything that you have to learn in the next two days. You can give people an easier on ramp because from the very first time they start talking to you and engaging with you as an organization, they can see, oh, this is what I'm gonna be working on, this is a problem, maybe I can have a fix for it ready to go by the time I get in the door. So where is this talk coming from? As I said, I personally enjoy open source, I like being able to work on it and when I started at MIT, I was faced with a pile of spaghetti code and a choice. 
I could either try and unravel that spaghetti or I could start fresh and start open, and so I made my argument of why I thought that was the better option and we proceeded to go with that. And so over the past year and a half, everything that we've written has been open source aside from our passwords; you can't find those on GitHub. Unfortunately there is one piece that we don't have open yet, because it has all of those binary GPG blobs in there. And so in a commitment to open, one thing that we've been doing is rearchitecting our system, where the majority of that rearchitecting is being done in the open, so that we can remove all those secrets, put them into HashiCorp Vault, add that layer of abstraction I was talking about, and then open source the rest of our configuration code. So stay tuned for that part.

So when should you get started? Well, yesterday is good, now is better, but as soon as possible, because the longer you keep all of your infrastructure code private, the more likely you are to put those IAM keys where they don't belong or add another helping of spaghetti on the plate. And so the sooner you start thinking about how can I open source this, what pieces can I open source, the more likely you are to engender that culture and ideal of decoupling your code, making sure it's easier to maintain, easier to distribute, easier for other people to understand what you're working on. Because if somebody outside your company can understand what's happening, chances are people inside your company can as well, so it reduces complexity, reduces issues with misunderstandings between teams or what you're working on.

How can you get started? Well, you can open source your next module for your infrastructure. So you've got that piece of code that you wrote that plugs HashiCorp Vault into your configuration management platform. Well, you can open source that, because everybody wants to be able to use Vault and they want to be able to do it in an automated fashion.
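As a rough illustration of the "layer of abstraction" idea, the Python sketch below keeps a configuration template public and fills in secret values only at deploy time. Everything in it is an assumption made for illustration: the `${secret:NAME}` placeholder syntax, the `render_config` function, and the use of environment variables as a stand-in for a secrets manager. It does not use the Vault API and is not the speaker's module.

```python
import os
import re

# Placeholders look like ${secret:NAME}; this syntax is invented for the sketch.
PLACEHOLDER = re.compile(r"\$\{secret:([A-Za-z_][A-Za-z0-9_]*)\}")


def render_config(template, lookup=None):
    """Fill secret placeholders in a public template at deploy time.

    `lookup` maps a secret name to its value, or None if it is unknown.
    It defaults to environment variables here; a real setup might query
    a secrets manager instead.
    """
    if lookup is None:
        lookup = os.environ.get

    def substitute(match):
        name = match.group(1)
        value = lookup(name)
        if value is None:
            raise KeyError(f"secret {name!r} is not available")
        return value

    return PLACEHOLDER.sub(substitute, template)


# This template is safe to publish: it names the secret but never contains it.
PUBLIC_TEMPLATE = "db_user: app\ndb_password: ${secret:DB_PASSWORD}\n"
```

Because the repository only ever holds the placeholder, the template can be shared freely while the value lives wherever the team keeps its secrets.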
Side note, I did that. If you want, there's a link later on if you want to take a look at the way we did that. You can open source a script that you wrote that makes it easier to just get your configuration management bootstrapped on your IAM nodes or open source the configuration for your Packer build because people want to know, how does Packer even work? You can open source the inner workings of your chatbot because it's the pieces that touch your systems that you don't want people to be able to see, but the guts are gonna be the same for everybody and chances are somebody else can take advantage of it. You can open source your system diagrams. Again, easier onboarding, easier to discuss the problems that you have and chances are somebody looking at your overall system architecture isn't then gonna be able to say, oh, that's the part I'm going to go and compromise, hopefully. You can open source your runbooks. So if you get woken up at three o'clock in the morning for RabbitMQ having a network partition, chances are other people have that same problem. So why don't you write up how you solve that problem, put it on GitHub, put it on Bitbucket, wherever you want to make it public, put it on a blog post and say, when I have this problem, this is the runbook I go by because again, the more that you make open, the easier it is for people to learn and that's what DevOps is about, right? Learning from each other. So now what? You can check out the code that I've written. You can go and open source something yourself and so one of the other things I wanted to talk about here was that as people who are learning continuously for our jobs, we say, okay, I wanna go use this tool, how do I do it? So you go, you Google around, you come across a blog post and you say, great, I'm gonna know exactly what to do by the time I'm done reading this and all you get to is hello world. You say, okay, now what? 
And then you say, well, I'm gonna figure it out and then I'm gonna write that intermediate blog post, but work gets in the way, life gets in the way. I'm sure we've all thought it, I've thought it. I say, okay, I'm gonna write that post, but then I never do. So this is another way to fill that gap of knowledge: by open sourcing the actual production-grade code that you're writing to scale and manage your systems. You're opening the door for people to get past hello world and broadening that on-ramp for anybody else who's trying to get started in this industry or get started with that tool. So I just went really fast through all of that. So I guess why don't I take some questions so that I can expand on some of these topics.

So this is good for, like, if I have individual scripts or, say, Puppet modules or things like that, but what about overall architecture patterns? Have you seen any examples where someone like me could go and just find a basic building block of how someone might build a piece of infrastructure for a particular application?

Yeah, so that's actually a really great example of where having your infrastructure code open source would be useful. Taking as an example something that I work with most of the time, a Django application: it seems simple, you just say, okay, it's just some Python code, but it's also a database. It's also Redis. It's also Nginx and uWSGI to make sure it's all running. It's also my monitoring and my load balancing, and how do I get those servers provisioned? So each one of those individual pieces is something that you can open source, and so you can say, okay, I've got a formula for deploying a uWSGI application. I've got a formula for installing Python and some Python code. I've got a formula for provisioning an RDS instance. And then what you can do is you can write the glue code that says, okay, now I'm going to orchestrate deployment of all of those things.
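To make the idea of glue code concrete, here is a minimal Python sketch that runs deployment steps in order and stops at the first failed health check. The `orchestrate` function and the step names are hypothetical, and the deploy and health-check callables are stubs; in practice they would call whatever formulas or provisioning tooling a team already has. This is an illustration only, not the speaker's code.

```python
def orchestrate(steps):
    """Run (name, deploy, is_healthy) steps in order.

    Stops at the first step whose health check fails and returns the
    names of the steps that deployed and passed their check.
    """
    completed = []
    for name, deploy, is_healthy in steps:
        deploy()
        if not is_healthy():
            raise RuntimeError(f"{name} failed its health check; stopping rollout")
        completed.append(name)
    return completed


# Hypothetical steps; real ones would call out to provisioning tooling.
example_steps = [
    ("database", lambda: None, lambda: True),
    ("application servers", lambda: None, lambda: True),
    ("load balancer", lambda: None, lambda: True),
]
```

Even without a diagram, a file like this gives a reader the order in which the pieces depend on each other and a place to start digging into each module.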
So you say, first I'm gonna bring up my database, make sure that that's healthy. Now I'm gonna bring up my application servers and make sure that they can connect to my database and to Redis, make sure that they're listening. I'm going to bring up a load balancer so that I can scale horizontally and you can just work piece by piece. If you start at the single instance level of here's how I deploy RDS and then you write all that glue code and open source that, that glue code right there is a perfect overview because even if you don't necessarily have the pretty diagram that says this is how everything fits together, for somebody who is motivated and trying to solve that same problem, they can look at that glue code, they can figure it out and chances are you'll have references in that code to all of the individual modules if they need to dig deeper on a particular piece of that system. You talked a lot about how you should not have your IAM keys in there, things like that. How do you deal with the situation where somebody new joins your team and they miss the memo about the IAM keys and oops, they committed? I was just speaking from experience this situation where we had something open source and we had all our keys in a separate place and then somebody basically was new and didn't know the policy and then that basically forced us to close some things. Should you just speak to that concern? Yeah, so that's one of the things where the overall practice of open source and code hygiene comes into play is that you want to make sure that you do have some guardrails for new people. You say, okay, everything that goes into production needs to have code review, you need to test it in a sample system because those are all checks where you can catch that mistake. If that mistake does make it all the way out to production, all the way out to GitHub, then yeah, you revoke the key and hopefully you notice it before any compromises happen. 
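One kind of automated guardrail for this situation is a scan that looks for credential-shaped strings before a change leaves the author's machine. The Python sketch below is an assumption-laden illustration: `find_secrets` and the pattern list are hypothetical, the patterns cover only AWS access key IDs (which start with `AKIA` or `ASIA`) and PEM private key headers, and a real check would need a broader rule set.

```python
import re

# A small, incomplete set of credential shapes; extend it for real use.
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_secrets(text):
    """Return (line_number, label) pairs for lines that look like credentials."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label))
    return findings
```

Wired into a Git pre-commit hook, a script could pass the output of `git diff --cached` to `find_secrets` and exit with a non-zero status when anything is found, which blocks the commit locally before a branch is ever pushed.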
Also making sure that you do have good documentation of just, these are our best practices. This is where our secrets go. And actually having a system for secrets management, which is one of the more complex things in our industry, but there are available ways to deal with it. So yeah, basically just make sure that you have those checks and balances. Make sure that you have decent onboarding so that when somebody does first start, you have somebody there to help them along and figure out what are all the pieces that they need to be thinking about. Did I answer your question sufficiently?

Yeah, if your code review is done on GitHub, as soon as someone uploads a branch, the key's already gone.

Right, and so that's why you want to make sure that you do have those guardrails, such as only being able to create a key that's scoped. Also, one of the great things about having all your infrastructure open sourced is that because we can learn from each other, you can see patterns of, okay, this organization does a really good job of managing the scope of their credentials. This organization does a really good job of automating checks. You can have pre-commit hooks to check, does this commit contain anything that I don't want to be made public? And, as the keynote presenter said, you can have the pioneers with infrastructure code that you can look at to see what are the ideas they put in their MVP, and then the settlers can pull in those ideas and add that to their infrastructure code and make it their own, so that eventually you get to the town planner stage, where some organization or group of organizations has settled on best practices and created a level playing field, so that those pioneers then have something new to build off of and innovate on.

In the vein of talking about security with your infrastructure, do you have any concerns about opening up your infrastructure code if, for instance, an attacking party were to review it and be able to find weaknesses in it?
I'm not saying that you necessarily have IAM keys checked in, but you might be able to discern that maybe you have a hash that's an MD5 and you didn't realize that. Maybe you have an unencrypted link between two systems and you didn't realize that. Do you have any concerns about that?

I think everybody who works in operations has concerns about that, but that goes to the point of what I was saying: when you know that your code is open, there's a better likelihood that you're actually going to consider, is this something that I really want to make public, before you make it public. And it encourages you to architect your systems in a way that you don't have issues with people being able to see the overall topology of your network, because you make sure that that's hardened. And open sourcing those examples of hardened network topologies lets other people learn from that, so that the overall problem of vulnerable systems becomes easier to manage and easier to deal with, because there are concrete production examples rather than the hello world of, this is how you create an IAM key and this is how you don't commit it to open source.

So I'm wondering, what license did you go with, and did you have to get that cleared by legal? Was it a huge pain?

Yeah, that's actually a very good point. So everything that I've open sourced, where? There. It's all BSD 3-Clause, and fortunately I was in the enviable position of having already had that question answered ahead of time, because MIT has a legal department where they say, these are the open source licenses that you are allowed to use, and from those I picked BSD 3-Clause because I feel that it's a very open license. It says anybody who wants to can do whatever they want with this code; just, if you change it, make sure that you don't say that I'm the one that did it originally, because I don't wanna take responsibility for what you did.
That being said, open source licensing is a whole big topic that lots of people have opinions about. Yeah, I like BSD 3-Clause. There are definitely other great licenses out there. Apache 2.0 is another common one because of the patent clauses built into it.

I can speak louder. How do you deal with the IP issues in general?

So I am in the good position of having the IP that I'm working on being intentionally open. So because I work at MIT in the department where we are trying to improve education globally, all the platforms that I'm working on are open source. So I run edx.org for residential students. We also feed back into the Open edX platform with fixes and improvements. We write a lot of applications that tie into the edX platform, and all of those are open sourced. So there's no real IP problems there. But also, unless your entire business is providing infrastructure or providing very low-level tooling, there's a good chance that there isn't any actual intellectual property that you need to be worrying about as an operations engineer. That's usually contained at the application level, where if you're working in pharmaceuticals you wanna make sure that people don't have access to the diagnostic code that is used to determine whether or not your drugs are fit for human consumption, or if you're working in, I'm coming up blank with examples. Unless you're working in a company where your entire IP is your infrastructure, then most of the time as an operations engineer that's not really a concern. If it is a concern, there's a good chance you're not gonna be open sourcing your infrastructure, but there are still pieces that you can release. So for instance, the runbooks.
You may not be able to open source your overall network topology and how you ship your logs, because that's the part that you're trying to sell to people, but you can say, if my servers go down in the middle of the night, these are the steps that I take to remediate that, or these are some of the key metrics that I care about to make sure that everything is humming along smoothly. So again, it's going back to, it's not the big O open source, where I wanna make the best project that everybody in the world is gonna use. It's little O: I just wanna make this open so that everybody can learn from it, so that I can get feedback on the work that I've done and make sure that I am continuing to improve, and I'm opening up those feedback loops. Because none of us are perfect, none of us are omniscient, and so the more opportunities there are for that feedback and feed forward, the better we're all going to be as an industry. The whole idea of a rising tide lifts all ships.

So sometimes at work I'm moving very fast and I don't have time to look deeply into some of the things that I wanna use that I find in the open source world. So I tend to use those things that are pre-vetted, that have a lot of information about them out there. So I wanna know, for smaller projects and things, what channels do you use to find the most innovative, interesting, and useful open source things that are out there that aren't the big-name, big-ticket things like Chef, Terraform, et cetera?

Sure, so side note from that, as you can see from the naming on these, I use SaltStack for my work, which is not one of the big three configuration management platforms. But the way that I keep up with the smaller fish and see what's going on is I subscribe to a lot of newsletters to see what are some of the up-and-comers. StackShare is a good one for that.
Maybe looking at GitHub trending, but also just talking to people in the community and saying, this is the problem that I'm having, here's where it's a problem, do you have any advice? And that's a really good way to let things bubble up that you might not otherwise be aware of. So one example of that is that recently one of the gentlemen from Threat Stack was talking about some code that he open sourced for being able to integrate LDAP with their infrastructure in a particular way, and so that's something that most of the people in this room have probably never heard about, but because you're engaging with your community and keeping an ear out for interesting use cases, you can find out about those things as they first come into being.

Well, if anybody does want to talk some more about it or has ideas or feedback, I'm gonna be around all day today and tomorrow. I'm gonna try and set up an open space for us all to talk about it, and you can also get in touch with me at any of these places. All right, thank you.

Thank you so much. Thank you.