All right, good afternoon, everybody. My name is David Lu, and I'm a Python Technical Consulting Engineer at Intel. Normally a lot of my focus is on the numerical and scientific side, but I also have a long background in infrastructure work. On that topic, I've come up with some interesting ways of using Buildbot, so I thought I'd give you a talk on how I've done that.

As an overview, we're going to talk about what I mean by infrastructure design patterns and what Buildbot is. We'll talk about hooking things up in unusual ways with ports, multi-masters, and pseudo remote procedure calls. Then we'll get into when to use containers, putting it all together with Python, and finish with some live examples and things I've actually implemented in the real world.

On the term "infrastructure": I know this term gets thrown around a lot, so what exactly is it? Is it automation? Is it orchestration? Are we talking task runners, or distributed task runners? I think there's a lot of confusion about what this actually means, and many options exist in the Python world for what you may need. You have Dask, IPyParallel, and Joblib, which are distributed task runners for numerical work. You have orchestration with Chef and Puppet. You have Celery and Kafka, which are also heavily tied in with Python and are more automated task schedulers and runners.

Many of these are very heavy-handed: they'll do a lot, but they require a lot of setup because they're not necessarily meant for your specific task. So you usually end up with one solution that's way, way too complicated for the job you're actually using it for. An example of this heavy-handedness would be using a distributed task system such as Dask, which I really, really love, to run a cron job, which is completely out of the scope of what it's supposed to be doing.
It's just not a good use case. Or trying to get Celery to do a MapReduce operation, which it generally doesn't like to do, or trying to get Puppet to build a task graph. These are all examples of competing features that don't fit the paradigm of whatever task you're doing: you're using these frameworks against what they were designed for.

To get out of that mentality of forcing the big frameworks in a direction they weren't meant to go, my approach is to take other frameworks that are loose enough to use as building blocks. One that I use is Buildbot, which is normally used for continuous integration. It's written in Python and uses Twisted as its backend. It's been pretty popular because it builds the Python language itself, and I think a few other languages like Rust as well. And you can construct its elements in really weird ways: just like Lego blocks, you can take bits and pieces and arrange them in an odd order to get the job done.

Continuous integration, or CI, tasks generally incorporate a lot of interesting pieces. You have a scheduler, you have dependencies, and you usually have some sort of result at the end that you're trying to achieve. But the main task components are actually composed of other primitives that you can break into smaller pieces and utilize, if you choose the right ones. Some examples of these pieces are what you see up here; the more notable ones are triggers, resource pools, distributed worker communication, the scheduler itself, and build steps. You've probably seen a lot of these if you've ever done any type of continuous integration or build work.
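To make those primitives concrete, here is a minimal sketch of the ordinary CI wiring in a Buildbot master.cfg. All names, URLs, and passwords are placeholders, and the exact plugin API depends on your Buildbot version:

```python
# master.cfg -- the standard pieces, wired the ordinary way (hypothetical names).
from buildbot.plugins import changes, schedulers, steps, util, worker

c = BuildmasterConfig = {}

# A worker: one unit of the resource pool.
c['workers'] = [worker.Worker("worker1", "workerpass")]
c['protocols'] = {'pb': {'port': 9989}}          # master <-> worker communication

# Change source: watch a repository for commits.
c['change_source'] = [changes.GitPoller('https://example.org/repo.git')]

# Scheduler: decide when a change becomes a build.
c['schedulers'] = [schedulers.AnyBranchScheduler(name='all',
                                                 builderNames=['build'])]

# Build steps: what actually runs on the worker.
factory = util.BuildFactory([steps.ShellCommand(command=['make'])])
c['builders'] = [util.BuilderConfig(name='build',
                                    workernames=['worker1'],
                                    factory=factory)]
```

Each entry in that dictionary is one of the Lego blocks: swap any of them out and the rest keep working, which is what makes the off-label wiring later in the talk possible.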
When you look at these components, you think, okay, this one is meant for this and that one is meant for that. But what I hope to show you today is that you can do a lot more with some of them. One of the unusual things about Buildbot is that because everything is split up, you have the option to wire the pieces together in unusual ways, and that's when the upper layer reveals itself. Normally, with Jenkins or TeamCity, you just use the tool in its intended format. But if you start to break out the pieces and say, well, I want this scheduler reporting to another scheduler, which then reports to something else, you gain some design flexibility.

Before I go on, a huge, huge warning, as someone who was formerly in the security field: the examples I'm about to show implement zero security. Before you ever put this in production or anywhere internal, understand that there is very, very little security or orchestration going on here. This is also considered a very off-label use of Buildbot, so don't go asking the Buildbot people, "hey, how can I set it up for this really weird, off-the-wall configuration?", because it's probably not a supported use case.

The first step in breaking out of the CI mindset is to look at the most common tasks, roles, and interconnects that occur in software deployment. You take a look and ask: what kinds of roles are always there? Buildbot, again, is just one way of filling them; you can prototype something with it and then develop something else when you go to production. Some examples of what I've actually implemented, and what I'll show you, are listed up here. Enterprise application deployment and license management have been the two most popular uses I've built proofs of concept for.
We'll go into that in a second, but other things have been done before too. So let's talk about what you actually need to start doing this. Buildbot has a lot of interesting components for wiring up a worker and a master, and one of the things it exposes is ports, which you can use in a non-standard way. Normally, the change port in Buildbot lets you trigger a build from an external source. What I'm going to show you is that you can point a symlink at a Python script that uses this port, which gives you a user-level trigger for whatever you want, very much like mapping something into /usr/bin so users can launch an application. By passing arguments through that Python script, I can also pseudo-remote-procedure-call into the worker, because the worker is just interpreting what comes through the port, authenticated by the password, as Python. So you can actually inject Python code. Again, sanitize your inputs here, but you can do some really weird things with this. And most of that logic lives in master.cfg, which is itself mostly interpreted as Python code.

Here's an example of what it actually looks like. In master.cfg, you normally have a change source, and you usually ask it to poll a repository: "I'm pulling this repository all the time; if someone checks something in, it'll update." That's great. But you can also add a secondary change source, and that's the method I'm about to show: I add that secondary change source on a port and then inject commands into it. One of the files you can use is fakechange.py, which ships in the Buildbot contrib directory; it connects to that port and sends the actual build command for a specific project keyword.
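As a sketch of that secondary change source, the relevant master.cfg fragment might look like this. The repository URL and project name are made up, and `change`/`changepw` are PBChangeSource's well-known defaults, which is to say: no real security.

```python
from buildbot.plugins import changes, schedulers, util

# Normal change source: poll the repository.
c['change_source'] = [
    changes.GitPoller('https://example.org/repo.git'),
    # Secondary change source: accept changes pushed over the master's PB
    # port -- this is what fakechange.py / `buildbot sendchange` talk to.
    changes.PBChangeSource(user='change', passwd='changepw'),
]

# A scheduler that only fires for changes carrying our project keyword.
c['schedulers'].append(schedulers.SingleBranchScheduler(
    name='emacs-trigger',
    change_filter=util.ChangeFilter(project='emacs-app'),
    builderNames=['emacs-app'],
))
```

The change filter is what keeps the injected changes routed to the right "project"; anything that can authenticate to the port can inject one.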
So if you create the scheduler for it, create the change request for it, and match the two together, you can send that change to the scheduler and trigger whatever you want, be it an application or a task of some kind. I know this has been a lot of talking, and you'd probably like to see it in action, so here's an example of it running. In this case, I've symlinked some commands to the Python script, and when I call it, it gives me a result back. What it's actually doing is running what I'd consider a Linux-ish shell in Docker on a different worker. So from one machine, I've triggered the ability to land a Linux session on a host somewhere in a big compute center, just from this simple pattern of injecting that code all the way through. I'm asking Buildbot to "build", but the build is calling the containerized application to run for the user. In the example that just flashed by, I showed the actual project command being used: I'm telling it to build this project, but that project is really a bunch of Python code that then calls the Docker application.

Next, you can think outside the standard build process by going multi-master, which creates some really, really odd situations. You can go from one worker being load-balanced into another worker through either a multi-master setup or a resource-pool-like arrangement. One thing I like to say is: don't hesitate to kick off another subset of Buildbot instances. You've created an abstraction that mostly works, and you can play around in that space, but occasionally you will have to break out of a single scheduler or master to achieve the task at hand.
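A minimal sketch of what such a symlinked trigger script might look like. Everything here is hypothetical: the project names, the master address, and the mapping itself. It decides which Buildbot project to trigger from the name it was invoked through, forwards the caller's DISPLAY as a change property, and shells out to `buildbot sendchange` (which speaks to the PBChangeSource shown earlier, using its default `change:changepw` credentials, i.e. zero security):

```python
import os
import subprocess
import sys

# Hypothetical mapping: symlink name -> Buildbot project keyword.
PROJECTS = {
    "run-emacs": "emacs-app",
    "run-term": "retro-term",
}

def change_args(argv0, display, master="buildmaster.example.org:9999"):
    """Build the `buildbot sendchange` command line for the symlink we
    were invoked through (argv0), forwarding the user's DISPLAY as a
    change property so the worker can X-forward back to the caller."""
    project = PROJECTS[os.path.basename(argv0)]
    return [
        "buildbot", "sendchange",
        "--master", master,
        "--auth", "change:changepw",      # default creds: no real security
        "--who", os.environ.get("USER", "unknown"),
        "--project", project,
        "--property", "display:%s" % display,
        "trigger.txt",                    # sendchange wants >= 1 file name
    ]

def main():
    cmd = change_args(sys.argv[0], os.environ.get("DISPLAY", ":0"))
    return subprocess.call(cmd)
```

With something like `ln -s trigger.py /usr/local/bin/run-emacs`, typing `run-emacs` injects a change for the `emacs-app` project, and the build it kicks off is what actually launches the container.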
To expand on what I just said, here's an example of the kind of commands I'm actually injecting into the Buildbot worker. In this case, I'm hard-setting the DISPLAY variable: if someone's display was passed as an argument through that fakechange script, you assign it to the command you're calling on the worker, and then you can run the entire Docker command. Here, the top one is for Emacs and the bottom one is for the retro terminal I was showing a second ago. I can have these intermixed, even on the same worker with the same build script, because each command is tied to a different project, which is what Buildbot considers a buildable unit.

The reason I'm even showing Linux containers here is that they're great abstractions for things that don't normally fit. When something really doesn't want to fit together, or you have a pile of dependencies, you can use a Linux container system to package it all up into a single task, whether through Docker, rkt, or any of the classic container technologies. In the case I was showing, it was actually Docker with Clear Containers underneath, which wrap the container in KVM so an escalation can't get out. With containers, what you typically want to do is cordon off the riskier bits. If you don't want things escalating out, or you don't want users mapping things onto the container's volumes, then this is one way of going about it. You can also use containers to provide privileged and non-privileged barriers between users, so that an application that's dangerous in some way can't escalate into something else. And at some point you may need orchestration to pull off certain tasks.
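In build-step form, that DISPLAY injection might look like the following master.cfg fragment. The image name is invented, and the `display` property is the one forwarded from the fakechange call; note that nothing sanitizes it here:

```python
from buildbot.plugins import steps, util

emacs_factory = util.BuildFactory()
emacs_factory.addStep(steps.ShellCommand(
    name='run-emacs-container',
    command=[
        'docker', 'run', '--rm',
        # Hard-set DISPLAY from the change property the user passed in.
        '-e', util.Interpolate('DISPLAY=%(prop:display)s'),
        # Share the X socket so the GUI lands on the caller's screen.
        '-v', '/tmp/.X11-unix:/tmp/.X11-unix',
        'example/emacs-image',
    ]))
```

A second factory with a different image gives you the retro terminal; each factory is attached to its own "project" builder, which is how both can live on the same worker.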
So just know what responsibilities you want and which technologies fit; depending on the problem, it depends. In most situations I've never had to use orchestration to pull something off, and I've seen some dangerous things done with Puppet. But if you need it, or your organization already uses it, it's something to consider. And as I said before, if all else fails, just bundle everything up in a container and automate that as a single component.

As for Python: the reason it's a critical component here is that the code you pass in, even from the script-level application I was showing, the symlinked fakechange script, gets passed to the master and then all the way down to the worker. You're essentially hopping that code across two or three machines to wherever the resources are. So if you need a very specific build of Python or NumPy, you can get that command over pretty easily. And Python is, as we know, one of the best glue languages in infrastructure, which is another reason it has been such a critical piece in a lot of what I've designed.

I'm going to show off two things: how a company-wide server application deployment works, again using the same mechanism of symlinked Python scripts, and then a license server for a floating license. Say you have an application with a floating license, but it can't reach out because of a company proxy or firewall; this is one way you can balance that license without contention on the given application resource. The way I've designed this first example, you have something in /usr/bin that's mapped to the change port I just demonstrated, and that talks to another server.
So rather than your user computer or the login node, you're talking to something with full privileges, and it runs the application for you and hands back the X-forwarded display, which becomes your application screen. You can update the applications on the back end through any kind of repository system (in this case I'm doing it through a Docker registry), and you can spin up new workers if you need more instances. That's where orchestration can come in handy: pseudo load balancing as you detect demand.

Now I'm going to show you how I set this whole example up. First, I have a Docker Compose setup with the containers running the master and one of the builders, and then two workers that I'll start up individually. Let me start by showing you what the worker looks like, and the fact that Buildbot comes with a GUI. When you want to look at the dashboard, you can actually see what's going on and interact with it directly; for users doing things that require direct application launches, that might be useful. In this case it's a local page, and you can see which of your workers are up and which tasks can be accomplished with which workers. The concept in Buildbot is that you tie the resources and capabilities of a given worker to a specific job; maybe one worker can do two of the jobs, or three, depending on which application you want it for. Here you can see that I've started it up, but the workers are offline, so I really can't do anything until the workers come up. You can think of that as saying: I don't have any privileged resources where those application servers actually run, so I have to go start those up.
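Tying jobs to the workers capable of running them is just the builder configuration; a sketch, where the worker names and both factories are placeholders:

```python
from buildbot.plugins import util

c['builders'] = [
    # Either worker can serve an Emacs session...
    util.BuilderConfig(name='emacs-app',
                       workernames=['worker1', 'worker2'],
                       factory=emacs_factory),
    # ...but only worker2 has what the retro terminal needs.
    util.BuilderConfig(name='retro-term',
                       workernames=['worker2'],
                       factory=term_factory),
]
```

The dashboard's "which tasks can run on which workers" view is simply a rendering of this mapping.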
In this next part, I'm going to start the workers up, and notice how dangerous this `xhost +` is; I'm just going to point that out here. I'm starting up each of the workers, and you'll notice that they connect to the master instance. They're communicating with each other, and when you look at the main page (I think I fumbled something over here... yeah, okay), you'll see that both of the workers are online. They're now green, showing that they're capable and ready to work. You can see the two names of the workers. So say you have an application that can only run on a particular machine, or needs a particular processor or hardware setup; then yes, you would need that binding.

Then I can actually start the job through either the change port or the GUI. In this case I'm running Emacs, and you say, okay, that's cool, that's just Emacs. Well, actually this is running inside a Docker container that's being controlled by the Buildbot worker. Its ability to manage these is really useful when trying to load-balance applications, or for administering your applications if you're an IT administrator of some kind.

The next one will hopefully show a bit more about what you can see from this side. For a given application, you can see who ran it and for how long, how many times it was run, who's currently running it, and how many slots are in use. I think all of those are really important. And again, you can see the symlinked call: this time, instead of calling it from the GUI dashboard, I'm calling it from the command line through the symlink, all the way through. You can see it's started up and running, and you can see the current state.
In terms of making proofs of concept very quickly, this is super useful when you want to see what's going on under the hood, especially with the dashboard. But again, I want to stress that this is really just for proofs of concept, and you'd probably be better off migrating to something else as you figure out your needs and move toward a production system. You can see these are two completely different Docker instances of Emacs running, and as we close them, you can see them cordoned off from each other. Cool.

Now, a floating license. You don't have to change much from the previous case to do a floating license. The license can either be held by the database attached to the Buildbot master instance, or you can use actual build logic within your build scripts to accomplish it. So it doesn't look too different from the previous example, and I'll show it here; I may have to fast-forward in the interest of time. Okay, I'm going to force one of them to start immediately. Let's assume I have one pool with two slots. Instead of those being machines, they're now instances, because you can start up Buildbot workers at will; maybe you have one machine with 50 of these workers, and that becomes your resource pool. So now I have one started, and I'm going to try to start a second. Let's see, I might have to fast-forward through it... okay, there it goes. Once the second one starts up, you'll notice that both of the workers have now been taken, and they're flashing yellow because they're both in use.
If this were your resource pool, or the number of licenses you had for the given application, the next person who tries to access it gets blocked and queued, and when someone closes their session, the next person in line gets it. In terms of queuing, this is really useful in an enterprise-ish environment: maybe someone says, okay, they'll be using the application until lunch, so I'll queue myself, go to lunch, come back, and hopefully I'll have landed my session by then. If I close one out, I can queue another, which I just showed, and then the next one starts up and, again, the next person in line has it.

So again, think about this pattern: what exactly here is abstract? The workers are abstract, either as pieces of a resource pool, or as parts of a whole acting as a license server tracking how many licenses you have, and you can spin them up on demand based on load. The kinds of things I've actually done in industry, to create a proof of concept or give a small company one way of building their own internal infrastructure, use exactly this. I've done a compute-session Linux handler, using the X11 forwarding I just showed to pull that off. I also have a home machine learning server that I use to kick off jobs; it continuously computes geographic data as it's updated. To go back to the earlier example of one that worked: the compute server was a little more complicated than the one I showed in the video.
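The queuing behaviour I just demonstrated can be modelled in a few lines of plain Python. This is only a toy illustration of the pattern; in the demo, Buildbot's own build queue and the fixed worker count do this for you:

```python
from collections import deque

class LicensePool:
    """Toy model of a floating-license pool: a fixed number of slots,
    with a FIFO queue for requests that arrive while the pool is full."""

    def __init__(self, slots):
        self.free = slots
        self.waiting = deque()
        self.holders = set()

    def request(self, user):
        """Grant a slot immediately if one is free, otherwise queue."""
        if self.free > 0:
            self.free -= 1
            self.holders.add(user)
            return "started"
        self.waiting.append(user)
        return "queued"

    def release(self, user):
        """Free a slot; hand it straight to the next user in line, if any."""
        self.holders.discard(user)
        if self.waiting:
            nxt = self.waiting.popleft()
            self.holders.add(nxt)
            return nxt
        self.free += 1
        return None
```

With two slots, a third request queues, and releasing any session hands the slot straight to the head of the queue, which is exactly the "go to lunch and land your session later" workflow.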
We had to model a login node, a smaller machine with smaller processes, and then a large server farm with full resources and privileges where the workers lived, along with the Buildbot master and the session queue, which you keep for the people logging into that node. With landed sessions, we were able to scale this to, I think the most I've ever seen was maybe 300 or 400 users, and this was on a very small setup, with a login node of maybe one machine. So for making a proof of concept of what your infrastructure should look like, it's useful at that point, and as you try to scale up, you can look at other kinds of infrastructure components to pull it off.

The machine learning server I implemented uses more of the continuous integration components. I again have a Buildbot master with a session queue. A worker that's basically running all the time pulls from a live data source, and when things have updated correctly and it has enough aggregate data, it sends it off to the compute workers, then posts the result back to my screen; that's all automated onto a monitor right above my couch, so I can see what the aggregate data looks like. Again, this is another use of the same components for a completely different purpose.

In summary: with a little ingenuity and creative use of components, you can pull together a lot of the infrastructure design patterns that appear in software and IT, with a lot of creative freedom, because as more of these weird frameworks and technologies get released, your ability to create new and exciting infrastructure patterns just gets better.
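The always-on polling piece of that machine learning setup maps naturally onto Buildbot's periodic scheduling; a hedged master.cfg fragment, where the names and the interval are invented:

```python
from buildbot.plugins import schedulers

# Re-run the data-aggregation builder on a timer; the build steps decide
# whether enough new geographic data has arrived to dispatch compute jobs.
c['schedulers'].append(schedulers.Periodic(
    name='poll-geo-data',
    builderNames=['aggregate-data'],
    periodicBuildTimer=15 * 60,   # every 15 minutes
))
```

The aggregation builder can then fire Trigger steps at the compute builders, reusing the same scheduler machinery the CI use case provides.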
Being able to rapidly design these things this way lets you look at inherent problems in how you've set things up before you go to a full-scale setup. I'd also like to restate: remember that the examples here are shown without any security or orchestration, so be very wary of that. One more thing you can do is keep an eye on upcoming technologies. I remember trying to do this before the days of really trendy Linux containers, and it was pretty difficult in some instances, and containers just opened the doors to other things. The orchestration technologies opened up another door of things you can possibly choose to do. So experiment with new tools often, and see what patterns can be made next. Okay, that's it for Buildbot. Questions?

No questions? Questions. Okay, we have one question. Thanks. Do you have any experience with buildbot_travis, which provides Travis YAML file integration into Buildbot?

No, but I have started looking at experimenting with it. I was really excited when I wanted to build some of my own open source projects with it, and I haven't really gotten the chance to experiment with it yet, but I do look forward to it.

Another question? There's one here. Great talk, thanks. You mentioned that a lot of these are proofs of concept and the security needs to be worked on. Are you going to pursue that at all, to give people tips on how they could secure it, or is that left to the user to figure out?

I think with how different infrastructures are between companies, and since this is more in the domain of IT security, which is less my area, I don't have that many tips to give. I just know that I've left a lot of holes open, and I'd rather trust my network and security IT folks about what they think should be done in this case.

Any more questions? Thanks, David. Give him a hand.