Please give him a very warm welcome.

Thanks, everyone. Okay, my mic is working. That's good. So, yeah, I want to give you a really quick whirlwind tour of how to call into C code quickly and compatibly using CFFI.

I'm going to start out with an admission, which is that when I began programming, and still pretty much to this day, I was and am terrified of C. For most of my career I've been using languages where, when you get something wrong, you get a traceback that roughly points out where in your code you made a mistake. Whenever I've tried to write a C program of marginal complexity, sometimes not even marginal complexity, sometimes it's a hello world, I'm like, oh, I wonder what this does. And you just get a segfault, your program exits, and you have no idea what happened. Everyone's like, oh, use GDB and Valgrind and all those things, and it's so confusing, and no one explained to me what they were, and it felt like an in-group. But I keep working with people who will say, oh, we can use Python as the glue code and business logic, and then all the slow bits we can just rewrite in C. And I really wanted to know, first of all, they make it sound really easy, is it? It's not that easy. But I wanted to figure out how I can actually do this, because I know there's successful software out there that does have the slow bits implemented in C.

And so I want to go through a quick reminder of why you would want to call out to C and C++. The big one is to make things faster. There are a lot of algorithms that will benefit from being implemented in C, particularly the ones where you're doing the same small operation over lots and lots of data, serially. String stuff is pretty much that, and numerical stuff. Calling out to system libraries is a pretty big one.
So sometimes, especially on OS X, for example, or Linux, that's anything where you want to interact with device drivers, anything where you want to deal with the windowing system. SSL and crypto is a big one. You do not want to be implementing this yourself, so you definitely want to be using a system library for that.

And then it turns out C is pretty much the lingua franca of modern computers. So when there's a new file format, a lot of the time it will be implemented in C or C++ first, and that's the reference implementation. And so you can get quick access to that from Python, especially if an official Python implementation hasn't been provided.

And then a really interesting thing is, if you are actually writing a C library and you're not terrified of it, or you are terrified of it but it's your job anyway, you can write the tests for your C library in Python. And that is actually a really nice way of doing things like generative testing, because writing unit tests in C is probably an order of magnitude worse than actually writing the C code itself. There are so many weird macro things that you have to do. So you can make that a little bit easier with Python.

And there are a few existing solutions. ctypes comes with the standard library. You have to set up all the bindings in Python yourself. It's all happening at runtime, which means there's a little bit of runtime overhead, basically the conversion of things from Python types to C types like integers and structs and arrays and all that kind of thing. Because you're telling the Python interpreter at runtime, it has no ability to make things more efficient in terms of calling out to that library. SWIG, Cython, and Pyrex are very widely used. You have to write a wrapper in a third language, which makes things a little bit more work for you. It's something you have to maintain. And getting it to work with Python 3 and 2 in one library is actually pretty difficult.
And then finally, you can always hand-write a C extension, but then you throw compatibility out the window. It's going to be really hard for you to write something that works across multiple Python interpreter versions. It's only going to work in CPython, and there's going to be a lot of manual code writing.

So CFFI is a great solution when you want to call out to a C library, you don't want to write a wrapper in a third language, you want it to work in CPython and PyPy across Python 2 and 3, and you want to be able to avoid the runtime overhead that ctypes has. Basically there's a library, libffi, which is how you call out dynamically to C APIs. You want to avoid that. CFFI was written mostly by Armin Rigo as part of the PyPy project, as determined by how many Mercurial commits on the project were written by him.

So I'm going to go over a really small example pretty quickly of how we would wrap an existing C library that implements red-black trees in Python. Red-black trees are actually really interesting, because it's sort of a canonical joke that you would be asked to implement one in a programming interview. We're not going to try to implement them ourselves. There are a ton of really, really good C implementations out there. What they're really useful for is anything like a set or a dictionary where you want to maintain the ordering of the elements in it. With a set or a dict in Python, you can walk through the elements, but it's going to be in a pseudo-random order. So in order to sort the elements of a set or a dictionary in Python with the default implementations, it's going to be a completely naive sort. If you have a red-black tree, you can, for example, walk through the elements in ascending order for a trivial performance overhead. And it's also a simple enough example for this presentation.
So I'm going to be going through how to wrap a C implementation of red-black trees in Python with CFFI. There are four main parts to a CFFI library. There's setup.py, which is pretty standard, though there are a couple of things you might want to note that are different in CFFI projects. You need to write a build file that glues together the definition of the C API that you want to wrap, and tells CFFI how to find the C source code and how to build the C source code that matches those header definitions. Sometimes you're going to include the actual C source code, which is very easy when it's under an open source license. You can just distribute it along with your project. And then you want a Python library, because unless you're wrapping really basic mathematical functions or string functions, you probably want some Python object-oriented abstractions over the raw C library. So the bulk of the rest of this presentation is going to be some idioms for wrapping idiomatic C data structures.

But let's start with setup.py. This is a pretty standard setup.py. The thing to highlight is that setup_requires is maybe a new keyword argument for you in the setup function. You've got install_requires below it, which says the library needs CFFI in order to be imported. But setup_requires says you actually need CFFI to install the library. And that's because it's going to need to fetch the C source code, figure out how to compile it on your machine, build all the headers, et cetera. So you want to put setup_requires in there, and install_requires, obviously. And then finally there is a cffi_modules argument. That rbtreebuild.py is our build file, which we're going to go over in a second. The argument is basically telling it there is a variable called ffibuilder, representing a CFFI module, in this file.
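As a rough illustration of the three keyword arguments just described, here is a minimal sketch of what such a setup.py might contain. The package name, the version, and the `cffi>=1.0.0` pin are assumptions made for the example, not details from the talk; the build file name and the `ffibuilder` variable are taken from the talk as transcribed.

```python
# Sketch of the setup() arguments discussed above. Names and versions are
# illustrative assumptions, not the speaker's actual file.
SETUP_KWARGS = dict(
    name="rbtree",            # hypothetical package name
    version="0.1.0",          # hypothetical version
    packages=["rbtree"],
    # Needed while setup.py itself runs, because the C module is built then.
    setup_requires=["cffi>=1.0.0"],
    # "<build file>:<variable>" tells CFFI where the builder object lives.
    cffi_modules=["rbtreebuild.py:ffibuilder"],
    # Needed at import time by the installed library.
    install_requires=["cffi>=1.0.0"],
)


def main():
    # Imported lazily so the arguments above can be inspected without
    # setuptools; a real setup.py would simply call setup(...) at top level.
    from setuptools import setup
    setup(**SETUP_KWARGS)
```

In an actual setup.py the file would end with a plain `setup(...)` call; the dictionary is only split out here so the arguments can be read on their own.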
And that's what marries the C function definitions and structs and all of that kind of stuff with the instructions on how to get the C source code, build it, and install it in Python. So it's a pretty simple setup.py, with not too many differences from a standard library that you want to package up.

So now we're going to go on to the build file. This is that little bit that glues things together. You're going to import, oh sorry, you're going to import FFI from cffi and create this builder object. I like to define the header as a string at the top level. You can put the string as an inline argument to the function call that I'll show you in a sec, but it's nice to have it at the top level for indentation. You're basically going to write out this restricted subset of C header files. So if you're familiar at all with C, you'll know that, for example, rb_tree_insert here, oh there we go, I've got the pointer. So this is just a function that returns an integer, takes a pointer to the red-black tree struct, and takes an opaque pointer, which is what void star means. It's a pointer, but I can't guarantee what's on the other end of it. That's the value. And you're going to have a bunch of these for pretty much any given library.

One thing to note is that for a struct, if you genuinely don't care about what's in it, but you just want CFFI to know there's some kind of struct being passed around here, that ellipsis isn't me just being like, oh, don't worry what that means. You can actually put dot, dot, dot in the header, and CFFI will fill in the gaps using the compiler. It will figure out what that struct actually looks like. So that's nice. And then, continuing, you're going to take that header string that you've defined and call cdef on the CFFI builder. That just says, okay, this is the header that I'm setting up, that I want you to expose in Python.
And then finally, this is the bit that tells it how to compile the C source code itself. These arguments are actually really similar to what a distutils or setuptools setup would look like for a C extension. So, oh yeah, that's the cdef bit. So we're going to say, I want you to create a Python module called rbtree._rbtree. This is following the Python convention of underscore-name for a C extension. So you're telling it the module name that you want to build. You can put arbitrary C source code in here. Here we're going to define a macro, because it's actually a really good place for that if you have an external C library that can be controlled with macros. You can just hard-code those macros in this file. I'm going to tell it to look in source root, which is a variable for wherever in my source tree I want to keep my C files. Look in source root for a file called rbtree.c. And that just says, okay, I want you to compile that file, and what you can expect is for it to have all the functions that I've defined in my cdef header.

And then finally we're going to say, if __name__ == "__main__", ffibuilder.compile(). That means we can just run this file and test our build pipeline. Just test that it gets the headers, it can find the C source code, it can build it, and everything is there that it expects to be there.

And then finally, the C source code. It almost doesn't matter if you're finding C source code from somewhere else. This isn't a C conference, so I'm not going to try to walk through how to implement this red-black tree in C. You just want to find one. Whatever you want to do in C, it almost doesn't matter what's in the C file as long as CFFI can find it.
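Putting the pieces of the build file together, a sketch along the following lines may help. The C declarations, the macro name, the `src` directory, and the file names `rbtree.c` and `rbtree.h` are all assumptions for illustration; the real library's header will differ. Only `FFI`, `cdef`, `set_source`, and `compile` are CFFI's own API.

```python
import os

from cffi import FFI

ffibuilder = FFI()

# Hypothetical directory holding the bundled C sources.
SOURCE_ROOT = "src"

# A restricted subset of the library's header. The declarations below are
# guesses at the shape of the API described in the talk, not the real header.
HEADER = """
struct rb_node { ...; };   /* "...;" lets the compiler fill in the layout */
struct rb_tree { ...; };

typedef int (*rb_tree_node_cmp_f)(struct rb_tree *self,
                                  struct rb_node *a, struct rb_node *b);

struct rb_tree *rb_tree_create(rb_tree_node_cmp_f cmp);
int rb_tree_insert(struct rb_tree *self, void *value);
size_t rb_tree_size(struct rb_tree *self);
"""

# Declare what should be exposed to Python.
ffibuilder.cdef(HEADER)

# Describe how to build the extension module "rbtree._rbtree".
ffibuilder.set_source(
    "rbtree._rbtree",
    # Arbitrary C source; RB_EXAMPLE_MACRO is a made-up macro name.
    '#define RB_EXAMPLE_MACRO 1\n#include "rbtree.h"\n',
    include_dirs=[SOURCE_ROOT],
    sources=[os.path.join(SOURCE_ROOT, "rbtree.c")],
)


def main():
    # Generates the wrapper C file and compiles it; needs a C compiler
    # and the C sources to be present.
    ffibuilder.compile(verbose=True)
```

In the build file described in the talk, the compile step sits under an `if __name__ == "__main__":` guard so that running the file directly exercises the whole build pipeline; it is wrapped in `main()` here only so the sketch can be loaded without compiling anything.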
And yes, so once you've got all that set up, you're going to run python rbtreebuild.py, that build file. That's going to call that ffibuilder.compile method, and it's going to produce a bunch of compiler output. If everything works well, in that rbtree directory, because we created a module called rbtree._rbtree (we want to create that __init__.py file ourselves, by the way), it's going to output _rbtree.c. It's going to take the C sources you pointed to and add a bunch of Python-specific stuff, and then it's actually going to build the C extension itself. From that point, this file is importable as a Python module. So you have your build toolchain set up.

Oh wait, I see you. Oh, cool question. So actually, I think you might be able to, because what you could tell it is to build a C file. So there are actually a couple of modes of working with CFFI. One of them is that you can attach to an existing C library that's already built and doesn't have the Python bindings built into it. The only problem is that then you're going to have the runtime overhead of libffi. So the question was just about whether you can use an existing library file.

Oh wait, there's another question. So if your C source code requires a special invocation of make, there are a few ways to approach it, because C is just really complicated. Ultimately, it's actually really simple, which means that all the stuff around it has to be really complicated. So you might be better off building a library and attaching to it at runtime, and having that runtime overhead, if it's really that complicated. Otherwise you have to figure out how to mesh your C build pipeline with effectively the distutils or setuptools style of telling Python how to build a C library.
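For the mode mentioned in that answer, attaching at runtime to a library that is already built, a minimal sketch could look like this. It uses the C standard library's `strlen` rather than a red-black tree library, purely so that there is something to load; `ffi.dlopen(None)` loads the standard C library on Linux and macOS and is not expected to work on Windows.

```python
from cffi import FFI

ffi = FFI()

# Declare only the function we intend to call.
ffi.cdef("size_t strlen(const char *s);")

# Attach to an already-built library at runtime (goes through libffi).
# Passing None loads the standard C library on Linux and macOS.
libc = ffi.dlopen(None)

# bytes objects are passed as "const char *".
length = libc.strlen(b"red-black")
```

Nothing is compiled here, which is the appeal of this mode; the cost is the per-call runtime overhead the answer refers to.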
So I can't get into that too much in this presentation, because it's a broad audience. But if you have something that's complicated enough in C that you've built out its own pipeline, you're probably able to answer that question for yourself better than I can right now. That's kind of how things go in C world. A lot of the time it's like, I can't help you, because this is why I'm a Python programmer and not a C programmer. It's so complicated.

Oh yeah, and just to cycle back. Once you've built that C module, you can just import it straight away. All of them have two objects by default, ffi and lib. lib is going to have all the C functions that you exposed available on it. It represents the library. The ffi object itself has a bunch of methods for doing things like coercing types to and from C types, creating arrays, and deallocating things. So that's going to be a kind of support module, and then lib is all the stuff that's specific to your library. So that's just a little interlude on how to use this built module. You can even jump in at the console and create a tree, and you get a pointer to it, and it's going to be this CFFI pointer type in Python. Then you can call lib.rb_tree_size on that, and that should return you the number zero, because you haven't put anything in it yet. So that's a little interlude there.

And then I'm just going to go through really quickly how you would actually wrap idiomatic C data structures from Python. So, the C object lifecycle. A pretty standard pattern for C libraries that manage data structures is that there's some kind of create method that returns a pointer to the data structure.
There'll be methods like insert, find, and remove that all take various arguments and operate on the data structure, and then finally some kind of dealloc method that destroys all the memory associated with that data structure at the end of the day.

A couple of things I want to point out. There are callbacks in C. A lot of things take callbacks. In this case, for example, the create method for the red-black tree is going to take a comparison callback. When you insert nodes into the tree, it needs to figure out where to put them to keep everything ordered, so it's going to need a callback, and for Python objects that means you need to come back into Python land to compare those objects. I'll show you how to do that in a second. There's also going to be a node removal callback. C doesn't really have any explicit memory management other than what you do with it, so this library that I found has a handy thing where you can provide a callback, so that when you remove an object from your tree you can let Python know that there's one less reference to it. I'll go over how to deal with the callbacks in a sec. And finally, oh yeah, those callbacks will be defined by typedefs, and these can actually go in your C header string in the CFFI build file. So this comparison type takes two nodes and returns minus one, zero, or one, and the node removal callback doesn't return anything, it just takes a tree and a node. So that's pretty handy.

So to build a Python API, we just want a class that represents the red-black tree. It's pretty standard. In its init, this ffi.new_handle gives you an opaque C pointer to the Python object. One of the things CFFI does is allow you to take Python objects, get opaque pointers to them, and then convert those back into the Python objects if you get them back out of C land. We're going to call rb_tree_create to create a new red-black tree, and we're going to pass it, I'll go over this in a second, some kind of comparison callback that tells the C data structure how to compare two Python objects. And we're going to give it this opaque pointer to the Python object, so that in our callbacks, which give you that pointer to the tree, we can go back and find the RBTree Python object. It's just helpful to be able to associate all these things in Python land with all these things in C land.

And also, a pretty important thing is that we want to set up a set that refers to all the objects contained in the tree. The reason why is that we're not doing manual Python reference counting on the objects that we put in the tree. Once we put a Python object into this tree, Python's garbage collector needs to know that the tree has a reference to the object. Because there's nothing in C that is doing that explicitly for us and talking to the Python interpreter, the easiest way is just to create a standard Python set of everything you put in the tree, and make sure that you add and remove things from this set as you put things in and take things out of the tree.

So that leads us to our add method, which, as you can see, just adds one thing to the tree. It gets a handle on the object, it adds that handle to the set of tracked objects, and then it finally calls lib.rb_tree_insert on the tree with the handle to that object. That is just going to pass this opaque pointer into the rb_tree_insert function in C. The comparison callback will be used to find that object's correct place in the tree, and we'll go over how you come up with what to put in there in a second.

Similarly, remove is going to get a handle to the object. It's going to call lib.rb_tree_remove on the tree, remember, this is the pointer to the rb_tree struct from C, with the handle to the object and some kind of remove callback that is going to prune it from that set of objects if it was actually removed from the tree. If you want, you can use the result of this to throw an error if there was no such object in the tree. I chose not to, because set.remove doesn't throw an error if the thing is not in the set.

And finally, another simple method is contains. You can get a handle on the object and then just call lib.rb_tree_find on the tree with the handle to the object. That's going to return null if the object wasn't found in the tree. So again, we can just implement all these Python magic methods and have a nice Python API by calling down to the C functions.

So now I'm just going to talk really quickly about how to implement that callback that we spoke about for comparing things. You're going to define another function in the C header string in the build file. It's rb_tree_node_compare, and it takes the tree struct and the two nodes. This extern "Python" tells CFFI, when you build the C module, don't expect to find this function in the C source code, it's going to come from Python. So that's what the extern "Python" means there. And then in our Python library we'll do @ffi.def_extern. Remember, this ffi is the object that comes out of the C module. I'm sorry. We'll define a function whose name matches the name we put in the header, which just compares two nodes. It's pretty simple. We can get the two nodes, use ffi.from_handle to get the Python objects that the values in those nodes refer to, and we can just do a simple comparison. If a is equal to b it returns zero, if a is less than b, minus one. The cmp function in Python is actually the same logic.
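The handle round trip and the comparison logic can be tried without a compiled module, since `ffi.new_handle` and `ffi.from_handle` are available on a plain `FFI` object. The sketch below is only an approximation of what was just described: `track` and `compare_handles` are hypothetical helper names, and in the real wrapper the comparison would be registered with `@ffi.def_extern()` on the compiled module's `ffi` object and would receive node structs rather than bare handles.

```python
from cffi import FFI

ffi = FFI()  # a plain FFI object is enough to experiment with handles

# Handles must stay referenced from Python for as long as C holds the
# pointer; a set like this plays the role of the wrapper's tracked objects.
tracked = set()


def track(obj):
    """Create an opaque void* handle for obj and keep it alive."""
    handle = ffi.new_handle(obj)
    tracked.add(handle)
    return handle


def compare_handles(handle_a, handle_b):
    """Return -1, 0 or 1, like the comparison callback described above."""
    a = ffi.from_handle(handle_a)  # back from void* to the Python object
    b = ffi.from_handle(handle_b)
    return (a > b) - (a < b)


h2, h12 = track(2), track(12)
results = (
    compare_handles(h2, h12),       # 2 < 12
    compare_handles(h12, h2),       # 12 > 2
    compare_handles(h2, track(2)),  # equal values, different handles
)
```

Note that two handles made from equal objects are distinct pointers, which is why lookups rely on the comparison callback rather than on pointer identity.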
So that's relatively simple. And then finally, when we create the tree, remember we were passing a comparison callback. The value of that is going to be lib dot, and then the name of that function that we defined in the header. So to tie it all together, that name, rb_tree_node_compare, is the name of this function that we're def_extern-ing, and then it's the name of this thing that we're accessing on the lib.

And for node removal it's very much the same. We're going to have an rb_tree_node_was_removed, a callback that represents, oh, a node got removed from the tree. The callback is defined in Python. It's going to get the tree, it's going to get the value, and it's going to remove the value from that set of tracked objects. And then finally, our remove method on the tree is going to pass in lib.rb_tree_node_was_removed as the removal callback. So that's how you implement C callbacks in Python, and CFFI will translate things back and forth pretty much automatically for you.

So I just want to do a really quick wrap-up of how this all worked. We've got our build file, our setup file, a little Python library that wraps everything, and the actual C source code itself that we want to compile, which is going to provide the implementations of the things declared in the build file. We do python rbtreebuild.py to check that our C is all compiling correctly. Then python setup.py develop. If you're not aware of this command, it is amazing. If you're working on a setuptools-installable module and you want to be able to make changes without having to do a full python setup.py install every time, python setup.py develop does the whole setuptools build thing and then just links the folder you're actively working on into your path, so that every time you import it you get the latest thing. So python setup.py develop is a really great command.

At that point, rbtree is available in Python. We can create the tree object, and we can add elements to it, you know, two, twelve, five, seven. We can get its length, and that's going to be calling into C for all of this. We can check: two is in the tree, yes; nine is in the tree, no. And finally, when we iterate through the tree, again by defining an iteration method that uses the C functions defined in that library, you actually get those values back in order. So you have a fast set type that provides in-order iteration of the elements, unlike Python's set. It's calling into C, so it's going to be really quick, and it's going to work on Python 2 and Python 3 and CPython and PyPy, and isn't that nice.

And so I want to thank you again for listening to this whirlwind tour of how to wrap a library with CFFI. If you have any questions, just come up to me if you see me. I've linked the CFFI documentation at Read the Docs, and also bit.ly slash cffi-example is a fully fleshed out implementation of this library that you can pip install right now if you want to. It's all on GitHub for you to browse. So thank you.

Thanks, Zach. Thank you. So as a token of our