I'm here to tell you about Chopsticks, a new library I wrote over the past year for running Python code on remote hosts. I'm Daniel Pope. We're going to go through what Chopsticks is, how to use it, and then take a little look at some of the tricks that make it work. Fundamentally, Chopsticks lets you run Python code on remote POSIX machines, in parallel, over SSH, with nothing installed except the SSH server and the system Python. I've been calling it "multiprocessing over SSH" for simplicity. It can also do some things with Docker, so we'll explore some of the parallels between how I'm using SSH and how you can use Docker, sudo and other things. This is a Chopsticks script: I'm importing my Git management function, so there's a Git revision function somewhere in the code base. There's this Group object, and I pass it a list of strings — if you just pass strings, they're hosts to SSH to — and then we can call that function on all of the hosts and loop over the results. I always like to show the implementation of the function, so that's how you might write a Git revision function: it's just a subprocess call, and you could write it better, I'm sure. There are some restrictions on the code you can run on remote hosts. It has to be pure Python. The function parameters are pickled, and when you run call, the callable you pass is pickled and sent to the remote side. The return value must be JSON-serializable, and the reason for that was a security concern: maybe your server gets compromised, and you don't want your laptop, which is running these orchestration functions, to be compromised in turn — pickle is not safe for that. So let's look at some of the things we can build with tunnels.
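Those two constraints — the callable travels as a pickle, the result must survive JSON — can be sketched locally with nothing but the standard library. This is an illustrative stand-in, not Chopsticks' internals, and git_rev here is a made-up stub rather than the talk's actual function:

```python
import json
import pickle

def git_rev():
    # Stand-in for the real function; returns a JSON-friendly value.
    return "abc1234"

# The callable is shipped to the remote side as a pickle...
payload = pickle.dumps(git_rev)
func = pickle.loads(payload)

# ...but the return value must survive a JSON round-trip, so that a
# compromised server can't feed the orchestrator a malicious pickle.
result = json.loads(json.dumps(func()))
print(result)
```

If git_rev returned something JSON can't represent (a set, an open file), the json.dumps step would raise — which is exactly the restriction the library imposes.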
I've got tunnels for SSH and Docker — and this is not Docker on Kubernetes, this is "spin up a Docker container": by default it's a container that's destroyed when the context is left. Here you can see it being used as a context manager; I think in my first example I wasn't using it as one. The context manager closes the connection at the end of the context, and of course that means any remote state is torn down. While the connection stays up, you've got a Python interpreter on the other end that could be doing anything — you could be spawning daemons or doing networking. Sudo runs as root, or as another user, and Local just spawns a Python subprocess, which is perhaps the least useful of these. If you pass any of those tunnels to a group, you can run code on the tunnels instead of over SSH. This is an example of using a group to execute code on multiple different Python versions — and for this you need nothing installed apart from Chopsticks and Docker; Docker will fetch the images and run them. Now, in this specific example, where we've got Python 2.7 and Python 3, the code you're importing of course has to be compatible with both. And it's a bit of a fudge: if you're calling from Python 2 into Python 3, you can expect to see strange behaviour — strings pickled on the Python 2 side arrive in Python 3 as bytes. That's just the nature of pickling. So this is not without pitfalls.
Groups also act as sets, so you can combine them. They are not connected at the moment you create them — the all-hosts union operation here happens offline — and then the tunnels are connected automatically to all of those hosts to execute isUbuntu, which is a function. So you could define all of your hosts in one module and then dynamically select some of them to run operations on. And that's an implementation of isUbuntu — it's not very good, but it fits on a slide. Now let's look at all of the operations a tunnel supports; this will be the kind of wax-on, wax-off of Chopsticks. Every tunnel supports these operations. There's an explicit connect, and a tunnel will always raise exceptions when a thing fails. Groups don't do that, so you can get the successful bits and the error bits separately and do your own error handling. put and fetch in particular exist to support streaming of large files, because you wouldn't want to pickle the entire thing and hold it in memory while it's being transferred. close, and the context manager, explicitly shut down the tunnel. That's an example of using fetch to pull a file from a remote host — here fetching the password file, which perhaps you don't want to do. fetch on a group has some special behaviour: you don't want every host to overwrite the same local file, so it's keyed by the name of the host, and it constructs a unique path to write each host's file into. Apart from that, the operations behave exactly the same on a group as on a tunnel. Also note raise_failures there, which is just a quick way of handling exceptions for a group.
So far the operations we've seen are synchronous: you make the method call and it blocks — for a group, until all of the hosts have computed their result and sent it back — and then it returns you the results object, which holds the responses from all of the hosts. To support the parallelism in a group, Chopsticks is async under the hood, and there's an API to make use of that, called Queue, which is my attempt at a synchronous API for dealing effectively with the asynchrony. In typical use you might just want to speed things up, so you can queue up a ton of operations and say "run all of that". In this example we're putting different operations on individual tunnels: whereas a group always passes the same parameters to the function for all hosts, here we can pass different parameters to different hosts and then run them all in parallel. All of these operations, incidentally, return an async result, and you can attach callbacks and so on. In the second example, by passing a group as the thing you're calling the operation on, it will do all of those things in parallel too. That's actually sending each of three files to each of three hosts, and you can imagine how that affects performance, because there's variability — potentially very wide variability if you're sending to different networks. By squishing up all of the gaps, the version with the queue can complete faster. I could have made that example a bit clearer, but you can see that, in terms of time, the queue example finishes a bit sooner than the operation on the group, and as you add more operations and more variability it would go a lot faster. OK, slide demo time: Chopsticks also works straight out of a Jupyter Notebook.
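The scheduling win described here isn't specific to Chopsticks' Queue, so it can be illustrated without it. The sketch below uses concurrent.futures with simulated per-(host, file) transfer times — all the names and delays are invented for the demo. "Group" style waits for the slowest host at the end of each wave; "queue" style runs everything at once, squashing the gaps:

```python
import time
from concurrent.futures import ThreadPoolExecutor

HOSTS = ["host-a", "host-b", "host-c"]
FILES = ["one.tar.gz", "two.tar.gz", "three.tar.gz"]

# Simulated transfer times: each wave has exactly one slow host.
SLOW, FAST = 0.2, 0.02
DELAY = {(h, f): (SLOW if HOSTS.index(h) == FILES.index(f) else FAST)
         for h in HOSTS for f in FILES}

def put(host, filename):
    time.sleep(DELAY[(host, filename)])  # pretend to copy the file

with ThreadPoolExecutor(max_workers=len(HOSTS) * len(FILES)) as pool:
    # "Group" style: one file at a time, waiting for the slowest host
    # before starting the next wave.
    t0 = time.perf_counter()
    for f in FILES:
        wave = [pool.submit(put, h, f) for h in HOSTS]
        for fut in wave:
            fut.result()
    group_time = time.perf_counter() - t0

    # "Queue" style: enqueue all nine transfers and run them together.
    t0 = time.perf_counter()
    futures = [pool.submit(put, h, f) for h in HOSTS for f in FILES]
    for fut in futures:
        fut.result()
    queue_time = time.perf_counter() - t0

print(f"waves: {group_time:.2f}s, queued: {queue_time:.2f}s")
```

With one slow transfer per wave, the wave version pays the slow cost three times over while the queued version pays it once — the same effect the talk's timing comparison shows.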
OK, this is going to be really hard because I'm looking up there. Conference Wi-Fi permitting... I have a tunnel to one of my servers. Sorry, it hasn't connected — it's defined lazily, so we'll see if it connects when I actually call a function. That function is defined in a cell. OK, so that's the number of processes running on that server. To make that work there is some special support for pickling — for serialising — callables that are defined in __main__. Perhaps you don't need to care about that unless you know what it means, but such a callable is actually sent as source, not pickled by reference. Normally the imports work by saying "I request this module, send me the code for it", and it uses the import system to find it; in this case it just sends the source, plus anything it relies on — it knows that this function relies on os. I should say that this all works with the standard SSH binaries. I've configured my SSH to use public key authentication, so I can log into all of my servers password-less. That's also the case where I work — we have Kerberized SSH — and I should say that Chopsticks doesn't really deal with the interactive bits of SSH. This all presupposes that you have configured your SSH infrastructure to be password-less, which I highly recommend anyway. OK, so I wrote Chopsticks last year. I came up with the idea because I was grappling with Ansible. Ansible is a great tool for what it's designed for, and I was not using it for that, so I started to think of ways I could do something better. But first I'm going to talk about Fabric, which I used before Ansible, and why I didn't go back to using Fabric. This is a Fabric script — I think there are probably a couple of imports needed to drive it.
This is a recipe for Fabric that I pulled straight out of a fabfile I've written, and you would run it with the command fab upload. What you're actually writing here, in most of the operations, is bash. Almost all fabfiles are mostly run, a little bit of rsync and put — and bash is fine for some things. If you want to do anything more complicated than that, you get into the territory of dropping a script onto the host, invoking it, shipping the results back and parsing them out. You don't need to do that in Chopsticks. Compared to Fabric, Chopsticks has a lot more flexibility. It doesn't have the same execution model; it doesn't have the fabfile restriction — you're writing whatever Python scripts you want to write. Fabric does have parallelism, but it was much more of an afterthought: it works by creating a multiprocessing pool and running the operations on that, and there is no persistent state on the other side, so if you want to rerun things or run multiple commands, you're restarting a process every time. It's also wedded to SSH. Fabric uses Paramiko, a Python SSH library, which means it can only work with SSH — it is made of SSH — whereas Chopsticks is pretty much agnostic about SSH. It uses the SSH binaries, but as we'll see, it's just wrapping those, and it can wrap Docker in exactly the same way. Ansible — well, this is not Python. This is your main interface to Ansible. People point out that Ansible is written in Python, but actually it is a programming language written in YAML — almost literally: you can step through the sequence of operations here, get variables out, assign values to variables and then substitute them into the next operation. So it's like a much worse bash. And the Ansible module API, when you actually get to wanting to write some Python code to extend Ansible, is a little bit more ugly.
I've wrestled with trying to write things for the Ansible API, and it kind of works, but I came to the conclusion that Ansible is all about the YAML and not about Python — and I'm a Python programmer. By comparison, Chopsticks gives you the ability to write normal, testable Python code. You can test a function locally and have quite high confidence that it will just run on a remote system. You can document it. I've got a project at the moment where I've written a class that I use both locally and remotely, for different parts of the operation. And, as you can see there: no deploy step. It's not opinionated. I've used it for a range of things — for monitoring as well as configuration management operations; it's currently a deployment tool. There are really very few restrictions on how you can use Chopsticks, whereas Ansible constrains you to what a playbook can express. In Chopsticks you have your entire code base available to import on the remote side. So, how it works. There are three tricks. First, we have to get a Python process running on the remote side. Second, it's effectively RPC, so we need to make requests to it and get responses back. And then we need it to be able to import code. How do we get a Python process running? Well, we can give a -c option and pass a little script on the command line. And because we don't want to send ten kilobytes on the command line, that little script just reads a larger script, and once we've got the larger script we can proceed. There's a file called "bubble", which actually breaks my metaphors — most of the metaphors in Chopsticks are about chopsticks. This one is about those toys where you put a blob of plastic on a straw and blow it up into a bubble: we inflate a process on the other side entirely with code coming from the orchestration host.
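The two-stage bootstrap can be sketched with a plain local subprocess — this is the mechanism, not Chopsticks' actual bootstrap code, and the stage-two script here is a trivial stub. Locally the command is just the Python interpreter; over SSH you would prefix the same argv with ["ssh", "host"], and the stdin/stdout pipes work identically:

```python
import subprocess
import sys

# Stage one: small enough to fit on a command line. It reads a larger
# script from stdin and exec()s it.
STAGE_ONE = "import sys; exec(sys.stdin.read())"

# Stage two: in Chopsticks this would be the "bubble"; here it's a stub
# that reports back over stdout.
STAGE_TWO = """
import json, sys
sys.stdout.write(json.dumps({"ready": True, "major": sys.version_info[0]}))
"""

proc = subprocess.run(
    [sys.executable, "-c", STAGE_ONE],   # over SSH: ["ssh", "host", "python", "-c", ...]
    input=STAGE_TWO,
    capture_output=True,
    text=True,
)
print(proc.stdout)
```

Once stage two is running, the same stdin/stdout pair the bootstrap arrived on becomes the channel for requests and responses.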
The command line in the first example is the kind of command line where we feed it a one-line version of the bootstrap script and then pipe in the full script. That's prefixed by ssh, so it's just running that script on the remote host, and because we've got stdin and stdout, I can feed it the requests and receive the responses over those pipes. The first thing that happens over the pipe is that it sends the bootstrap script; that bootstrap script then turns the pipe into a message-passing protocol — it moves stdin and stdout out of the way — and then we've got bidirectional communication to the remote process, which could even be on the same host. And that's the message protocol: just a standard binary protocol with arbitrary-sized data, a message type, and a request ID, for convenience, to dispatch responses to the right thing on the remote side. Within that there are serialisation protocols: when you're uploading a file, the data is just the bytes of the file and the operation is "put these bytes"; when you're calling a function, the data includes the pickle of the code you're sending. The last step is importing code from the orchestration host, and for this the RPC goes the other way. We make a call to the remote host, and the remote host says: in order to unpickle this function — or perhaps the function contains an import — I need to import this module. There's an import hook, which is a capability in Python to customise the way imports are done, and it goes back to the orchestration host and says "have you got this file?", and the orchestration host sends it. So importing on the remote side goes through this import hook, and all imports of pure-Python stuff are delegated to the orchestration host — apart from, hopefully, the standard library, which comes from the deployment of Python on the remote side.
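The import-hook trick can be shown in miniature with importlib. This is a minimal sketch of the mechanism, not Chopsticks' hook: the REMOTE_SOURCE dict stands in for "ask the orchestration host over the pipe", and fakemod is an invented module name. Anything the finder doesn't claim, such as the standard library, falls through to the normal local import machinery:

```python
import importlib.abc
import importlib.util
import sys

# Stand-in for the round trip to the orchestration host: in Chopsticks
# the source would be requested over the stdin/stdout message pipe.
REMOTE_SOURCE = {
    "fakemod": "import os\n\ndef where():\n    return os.name\n",
}

class OrchestratorFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    def find_spec(self, name, path=None, target=None):
        if name in REMOTE_SOURCE:
            return importlib.util.spec_from_loader(name, self)
        return None  # everything else (e.g. os) imports locally as usual

    def exec_module(self, module):
        # Execute the fetched source in the fresh module's namespace.
        exec(REMOTE_SOURCE[module.__name__], module.__dict__)

sys.meta_path.insert(0, OrchestratorFinder())

import fakemod  # resolved through the hook, not the filesystem
print(fakemod.where())
```

Note that fakemod's own `import os` is satisfied by the local standard library, matching the talk's point that only pure-Python application code is pulled from the orchestration host.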
Having made all of that work, it unlocked lots of ideas for extending this. One idea was to enable this kind of configuration: experience where I work suggests that you can run maybe 20 SSH processes comfortably in parallel on one of our cloud boxes, and we have many more hosts than that. It's already possible in Chopsticks to import Chopsticks on the remote side and make additional connections — you can call a function that is itself defined to use Chopsticks. So one of the things I'm hoping to write is a full proxy for Chopsticks that is entirely bootstrapped dynamically. There's a project I'm trying to work on at the moment where I need to connect to 750 hosts. I should just be able to connect to, say, 20 — pick 20 at random, or define another strategy for selecting them — and on all of those hosts import Chopsticks and have each of them connect to 20 more hosts. That gives me 400 hosts, and maybe I can tweak the numbers a bit or go deeper. A spanning tree of connections to hosts is potentially achievable, with latency trade-offs and so on, obviously. And I have wrapped other transports. SSH was the reason I invented this — I wanted something that worked a lot like Fabric and Ansible — and then I suddenly realised that Docker gives me access to stdin and stdout of a Python process as well. And sudo gives me stdin and stdout, as long as there are no password prompts. So what else? I think there are loads of other things that could give me a Python process in some other context and feed my code to it. Here are the links: I've got extensive documentation on Read the Docs, and the project is on PyPI and on GitHub — please star it. I've also got stickers, if you would like one; if I run out there are some on the... OK, I will open those up in a second.
And I will be sprinting on this at the weekend. With that, I will take questions. Okay, thank you so much. So, first question. Thank you for your talk, that's great. I was just wondering: say you have an infrastructure with a diversity of servers and you're using the SSH tunnelling, and you have different versions of Python running on the different machines, and something doesn't work on some of them. When you get the results back, can you analyse what's going on on the different servers? Do you know that something has failed? Okay, so I didn't show error handling, but I can... I can't get it to mirror my displays, so let's hope this works. It doesn't support that resolution. Can we try to answer that without the projector? Okay. So, when you are using a single tunnel, it raises exceptions. When you're using a group, the result object you get contains the successes and the failures, and the failures contain a stack trace. There's no access to the actual frame objects, but you do get a stack trace, and if you are recursively tunnelling you'll get a stack trace formatted so that this stack trace called this stack trace, or something like that, and you can pick apart what happened. The responsibility for error handling is mostly on the app that's using it; Chopsticks is not opinionated about how you might want to surface those errors. The Ansible model for error handling is that when you get an error on a host, you stop processing that host and the rest of the playbook won't run on it — you might want to try and recover instead. Okay, thank you. There are more questions coming, right? Yes, thank you very much. That looks awesome. Two very short questions.
Is there a simple way to pass environment variables to the environment I'm calling into, and will this work if the client machine is Windows with a working SSH binary? You cannot pass environment variables, but you could call a function that modifies os.environ. As for Windows: there is basically no support for Windows. However, the remote side has been written not to use the kind of async stuff that runs on the orchestration host, so it would be possible to create a Chopsticks agent for Windows. If you configured a Windows SSH server and Python correctly, you might even be able to get it to work right now, but I think that's an opportunity for future work. I don't run anything on Windows, so I don't really intend to do it myself — if anybody wants to contribute, you can join me at the sprints. Thank you. Hi — thank you for this awesome library, it's better than Execnet. In my previous team we used it for test automation, where we piped stuff through docker exec. The one thing that was difficult is that you obviously cannot push closures through, because the remote side can never get the data it needs. But perhaps something could be done, like a snapshot of the variables that the closure needs — what do you think of that? So, I think the way the serialisation of those Jupyter cells works is actually similar to that: it takes a snapshot of the globals and the method code. I still think it would be quite difficult to do for a closure. What you can do: partials are pickleable — there was once a bug where functools.partial was not pickleable, but it is now. Also, if you create a class that is pickleable, its instances are pickleable, and a bound method is pickleable.
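Those two ways of coupling data to a callable — a partial, and a bound method of a pickleable instance — are easy to demonstrate with plain pickle. The functions and class here are illustrative stubs, not anything from Chopsticks:

```python
import pickle
from functools import partial

def count_procs(pattern):
    # Illustrative stand-in for a function you'd ship to a remote host.
    return f"counting processes matching {pattern!r}"

# A partial bakes arguments into the callable, and survives pickling
# (this was once broken for functools.partial, but has long been fixed).
p = pickle.loads(pickle.dumps(partial(count_procs, "sshd")))
print(p())

class Checker:
    def __init__(self, pattern):
        self.pattern = pattern

    def run(self):
        return f"checking {self.pattern!r}"

# A bound method of a pickleable instance is itself pickleable,
# carrying the instance's state along with it.
m = pickle.loads(pickle.dumps(Checker("nginx").run))
print(m())
```

Either form gives you a self-contained callable with its data attached, as an alternative to passing parameters at call time.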
So there are ways of coupling data to your callables beyond just passing parameters when you call tunnel.call or group.call. I'm not sure if these are related questions or not. First: if I understood you correctly, sometimes functions are shipped over the wire in source-code form and sometimes in some other form — when do you choose which, and what is the other form? If your script is running as __main__ and the callable is defined there — so callable.__module__ == '__main__' — the source is sent, because those callables are not pickleable by default. Otherwise, you pickle the function, and that is what you ship over the wire? Yes. My other question: it seemed like you had something very spiffy where you could run multiple Python versions on the remote hosts, and you were even crossing the streams between 2 and 3. There are very important differences in pickle between Python 2 and Python 3 — are you just not paying attention to that, or are you intelligently converting between the pickle formats? And what about bytecode differences and things like that? So, one of the first things the implementation does is ask what the highest pickle version supported on the remote side is, and it uses that pickle version. And, to avoid some problems when you do it in a group, when you connect a group it chooses the lowest common denominator pickle version for the entire group. Does that make sense? Okay, thank you. More questions coming? Yeah — okay, basically my question is a variant of the two previous questions, but have you considered sending bytecode instead of the inline source code? You would have the same problems between Python 2 and Python 3, but maybe it would go...
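That "lowest common denominator" negotiation can be sketched in a few lines. This is an illustrative sketch of the idea, not Chopsticks' actual handshake — negotiate and the example protocol numbers are invented here:

```python
import pickle

def negotiate(remote_highest_protocols):
    # Each peer reports the highest pickle protocol it supports; to be
    # safe across the whole group, take the minimum of everyone's
    # maximum, including our own.
    return min([pickle.HIGHEST_PROTOCOL] + list(remote_highest_protocols))

# e.g. a group mixing Python 2.7 (highest protocol 2) with newer Pythons:
proto = negotiate([2, 4, 5])
data = pickle.dumps({"rev": "abc1234"}, protocol=proto)
print(proto, pickle.loads(data))
```

Every member can then unpickle what any other member produces, at the cost of using the oldest protocol in the group.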
I don't think I'm going to try that. I think there are too many problems in trying to use bytecode compiled for one version of Python on another version of Python. And I should say that in the expected use cases for this, you're using the system Python interpreter, whether that's Python 2 or Python 3 — it actually tries to match python2 or python3 to whichever system Python interpreter you have available. So it is very likely that even the minor versions won't match, and trying to pass bytecode around is doomed to failure. Thank you — and one last question here. Hi, would you also consider implementing some dummy server instead of an SSH server in order to, for example, enable communication with Windows? Yeah, well, I've partly answered that question already: it's far down my roadmap, but it is something I've thought about, and the code I have written is compatible with that aim. And that's the answer to the exception-handling question: you get a remote exception that carries the traceback, and these paths are paths to code imported from the orchestration host. Okay, thank you so much.