Hi, my name is Aurélien Aptel, and I've been working on making Emacs extensible in C and other languages. I was afraid this talk wouldn't be accepted at first, because the dev room is mostly Java stuff, so it's kind of far away from Emacs. I wouldn't agree with this. No offense.

So, GNU Emacs. You might have heard of it. I'm guessing that if you're in this room you already know a bit about it, so I'll go over this quickly. It's an editor of the Emacs family. It was written in 1976, so it's technically 42 years old, which is a pretty cool age, I guess. It's still popular, although it's certainly less popular than it was, I guess, 20 years ago or so.

It's extensible. It's one of the first editors to be extensible in a proper language. Again, no offense. These are just jokes, don't take this too seriously. And it's self-documenting; it was one of the first in that respect as well. That means every time you extend something you can attach a docstring, similar to Python if you're familiar with that, and you can access the documentation from within Emacs. You can do everything from within Emacs, and everything is documented and extensible.

Emacs is also intertwined with hacker culture. Lisp was extremely popular in the MIT AI Lab, and they even developed Lisp machines there, where the processor would actually have instructions to deal with conses and such. So it's fascinating stuff.
I've put a couple of links here that you can explore if you want to dig into it. Did you know Stallman published an ACM article about Emacs? I didn't, so that's interesting. There's Jamie Zawinski, who made the timeline of the different families of Emacs out there. Recently Stefan Monnier, the current maintainer of Emacs, wrote an article about how the language evolved and how it got all its features. So there's plenty of material to go through.

So, back to the editor. Emacs is extensible in Emacs Lisp. It's a sort of nice language, although many people familiar with Lisps would say it's not very well designed. It kind of grew organically, and compared to other Lisps it's maybe not the most practical to use. Emacs has a bytecode compiler, written in Emacs Lisp, and it has a VM. So basically Emacs is an interpreter for Lisp which also, you know, features editing commands and such.

It's still not very fast. Some people have tried to make it faster. So they made the bytecode compiler, and at the moment there's an experimental JIT branch. I haven't tried it yet, but it looks promising; I've heard it could speed up some use cases by a factor of two. Tom Tromey, who might be in this room (I don't know if you're here, hi), has worked on a Lisp compiler that would target LLVM. The compiler itself is written in Emacs Lisp, which is an interesting choice. He also worked on the Emacs FFI, which I'll talk about later.

If you want to extend Emacs, there are a couple of ways to interact with the rest of your system. Initially you had files. It's a text editor.
So files are a common way to exchange data from one program to another. Eventually it got processes, so you can start a process and interact with it. The way it works in Emacs is that you associate a process with a buffer, so every time the process writes something it gets appended to the buffer, and you can set up callbacks to process the output. They're called inferior processes in Emacs jargon. You also got TCP and UDP sockets, so you can technically make full servers and clients within Emacs. I know a guy on the mailing list, Nic Ferrier; I think he made a full web server in Emacs. He has a whole web page about it. It's interesting, again. We also have D-Bus, but I don't know much about it, so I can't really go into that.

All of those methods are not always convenient, because at some point you always need to serialize your data structures from Lisp to whatever program you want to communicate with, and you have to do it in both directions. The most common solutions are these. Either you write your whole program in pure Emacs Lisp, which means basically no inter-process communication, just everything in Emacs. Alternatively, you write a separate program and call it from Emacs, so you can do the heavy lifting in your external program and then interact with it from Emacs. There's also another way to do things: some extensions have a separate program, but it's a server, so you can keep state between calls from Emacs by keeping it in the server.

The next way to go about it is to have a native API to interact with the system at the lowest level, so a C API. This is not new; some people have been trying to do this for a while. In 2000, Steve Kemp made it possible to have C DEFUNs. A DEFUN is the way, in Emacs core, that you implement a new function in C.
There's a macro called DEFUN that defines some stuff for you, and it ends up making a new Emacs Lisp function. So he made a patch that would allow you to dynamically load C DEFUNs this way, but it was never merged, for various reasons. One of them I'll go into later. On Reddit someone told me, actually [name unclear], I didn't know this guy, but he told me that shortly after this he made another improvement on it and tried to send it, but it never got merged either. It did end up in XEmacs, though, which is a fork of Emacs that some of you might remember from back in the day.

In 2006, Dave Love tried again, same idea, but this time he used libtool for the dynamic loading. This is a GNU tool and set of libraries that makes it easy to dynamically load DLLs or shared objects. The API works on every platform, so this was more portable than the other attempt. But again, it was never merged.

So why was this never merged? Well, mostly because the people maintaining Emacs, which was mostly Richard Stallman at the time, were afraid that people would ship free Emacs along with already-built shared libraries which could potentially not be free. So he was afraid people would make bundles of Emacs with non-free software. He's pretty anal about this, as you might know.

But this problem has existed for a while in other projects. GCC, for the longest time, also had issues with this. They wanted plugins, dynamic plugins, but it never got any traction because of this GPL problem. But sometime in 2009 they settled on a compromise, and this compromise is that the plugin has to have a symbol, and the name of the symbol is literally plugin_is_GPL_compatible. So you might ask how this is enough to enforce laws and stuff like this. I'm not a lawyer.
So I can't really explain the details, but the GNU coding standards have this quote: by adding this check to your program you are not creating a new legal requirement. The GPL itself requires plug-ins to be free software, licensed compatibly, and the GPL already requires those plug-ins to be released under a compatible license. The symbol definition in the plug-in makes it harder for anyone who might distribute proprietary plug-ins to legally defend themselves. If a case about this got to court, we can point to that symbol as evidence that the plug-in developer understood that the license had this requirement.

So yeah, it doesn't really enforce anything, but it's enough to sue, I guess. So that was the workaround.

In 2014 came my own attempt. I learned about this workaround pretty late, and I thought it would be a good way to extend Emacs even more. So I made a little patch again. It was linking against the Emacs binary, which is apparently not something you always want; I'll come to that later. It was kind of the same as before: it allowed you to write C DEFUNs outside of Emacs and load them dynamically. I sent it to the mailing list and it was received positively, but people wanted more. There were some iterations on it. Most people don't like it when their editor crashes. Surprising thing, right?
So people wanted something more robust, because the way it was done, you basically had to know the internal data structures of Emacs to interact with it. After an update, if you loaded the same dynamic plugin, written for the previous version of Emacs, some fields might have been added to structures and such, in a way that would just make Emacs crash. This is really not something you want in a text editor.

So the next iteration is very similar in design to the JNI, which Java people might know. It's the Java Native Interface, which is basically what Java uses to do the same thing. I had a lot of help on this, from people at big companies actually: Daniel Colascione, at least at the time, was working at Facebook, and Philipp Stephani is at Google. So there are a lot of Emacs users still out there. We implemented this, and the design is a lot different, as you'll see. After more iterations it was finally merged in Emacs 25. It was basically two years of on-and-off work, not full-time, with a lot of reviews and bikeshedding on the mailing list, as is always the case with Emacs: people complain about small stuff and you have to redo things.

First of all, you have to build your Emacs with module support. It's not enabled by default yet, so you have to pass --with-modules when you run configure. So how does it work?
In this version you don't link against the Emacs binary. There's a header file you just need to include, which defines the set of structures and function pointers by which you interact with the Emacs Lisp VM through a proper API, so it doesn't break after Emacs releases.

All the C functions you want to expose have to have this prototype. You have the Emacs environment structure, which I'll go into in detail afterwards, the number of arguments, an array of Emacs Lisp values, and a user-provided pointer of your choice. The environment provides you with the function pointers to interact with the VM. This one converts a C int into an Emacs Lisp integer value. The function returns an emacs_value, which is an opaque type; I'll come to that later. So this is basically what an Emacs Lisp C function looks like.

In order to make this function callable from Lisp, you use the environment pointer. There's a function pointer in there called make_function. You pass it the minimum number of arguments, the maximum, a pointer to the function you defined earlier (the C stuff), the docstring here, and this is the void star, which is passed as the last parameter of the function here.

Emacs Lisp is a Lisp-2, which means every symbol has two cells. It's like having two namespaces, one for variables and one for functions. When you do a defun in Elisp, where you say the function foo is defined as this code, what it actually does in the background is set the function cell of the symbol foo to this lambda. So that's basically what you have to do using the API. funcall is the function pointer to make a function call in the Elisp VM. You have to pass it a symbol for the fset call. This is done via the intern function, which converts a string to a symbol, so you pass it fset.
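To make the two-cell idea concrete, here is a toy model in plain C. It is only a sketch of the concept and not how Emacs implements symbols: toy_symbol, toy_intern, toy_fset and toy_demo are names made up for this illustration, and the fixed-size table is an arbitrary simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*toy_fn)(void);

/* A toy Lisp-2 symbol: one name, two independent cells. */
typedef struct {
    const char *name;          /* must outlive the table (string literals here) */
    int         value_cell;    /* what foo means when used as a variable */
    toy_fn      function_cell; /* what foo means when called as a function */
} toy_symbol;

#define TOY_MAX_SYMBOLS 16
static toy_symbol toy_table[TOY_MAX_SYMBOLS];
static size_t toy_count;

/* intern: map a name to its unique symbol, creating it on first use. */
toy_symbol *toy_intern(const char *name)
{
    for (size_t i = 0; i < toy_count; i++)
        if (strcmp(toy_table[i].name, name) == 0)
            return &toy_table[i];
    if (toy_count == TOY_MAX_SYMBOLS)
        return NULL; /* table full */
    toy_symbol *sym = &toy_table[toy_count++];
    sym->name = name;
    sym->value_cell = 0;
    sym->function_cell = NULL;
    return sym;
}

/* fset: store a function in the function cell; the value cell is untouched. */
void toy_fset(toy_symbol *sym, toy_fn fn)
{
    sym->function_cell = fn;
}

static int forty_two(void) { return 42; }

/* Use foo as a variable and as a function at the same time. */
int toy_demo(void)
{
    toy_symbol *foo = toy_intern("foo");
    assert(foo != NULL);
    foo->value_cell = 7;              /* foo as a variable */
    toy_fset(foo, forty_two);         /* foo as a function */
    assert(toy_intern("foo") == foo); /* same name, same symbol */
    assert(foo->value_cell == 7);     /* fset did not disturb the value cell */
    return foo->function_cell();      /* "funcall" through the function cell */
}
```

The point of the model is that binding a function to a name is just a store into one cell of an interned symbol, which is why the real API can express it as a call to fset with a symbol and a function value.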
There are two arguments. The first one is the name of the function you want to bind to, and the second one is the actual lambda you want to bind to it. So this is exactly, yeah, fset foo. Okay.

If you put it all together, this is basically what your plugin would look like. You have the mandatory symbol to say your plugin is GPL compatible. You call make_function, you pass all the arguments, and you bind it to the name you want. In this case the function will be callable through the name mymod-test in Elisp, and that's it.

I forgot to mention something: emacs_module_init is the function that Emacs will call when it loads your plugin, so everything that needs to be initialized is done there. It's basically the entry point of your module.

To compile, it's pretty standard if you're familiar with shared libraries on Linux. This actually works on Windows as well; you just need to call the equivalent. The first step is to turn your C into object code. You want to pass the position-independent flag so that it works no matter where it's loaded in the Emacs process. Then you turn it into a shared object. It's just a magic incantation.

Within the API, how does memory management work? Every time a C Lisp function is called, all the stuff you allocate in it is automatically freed when the function returns, so you don't really have to worry about memory management within the function. You can mark values as global; they're reference counted. So if you need to use the same values in two different function calls, you can do so by making them global, as in this example.
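Put together, a module along these lines might look roughly like the following. It is a sketch written against the public emacs-module.h header, not the exact code from the slides: the C names Fmymod_test and Fmymod_true, the second Lisp function mymod-true, and the docstrings are made up for illustration, and the build commands in the trailing comment assume GCC on Linux with an include path you would need to adapt.

```c
#include <emacs-module.h>

/* Mandatory symbol: Emacs refuses to load a module that lacks it. */
int plugin_is_GPL_compatible;

/* Global reference to the symbol t, created once in emacs_module_init. */
static emacs_value Qt;

/* Exposed to Lisp as (mymod-test): returns the integer 42. */
static emacs_value
Fmymod_test (emacs_env *env, ptrdiff_t nargs, emacs_value args[], void *data)
{
  (void) nargs; (void) args; (void) data;
  return env->make_integer (env, 42);
}

/* Exposed to Lisp as (mymod-true), a hypothetical second function:
   reuses the cached global reference instead of interning t again.  */
static emacs_value
Fmymod_true (emacs_env *env, ptrdiff_t nargs, emacs_value args[], void *data)
{
  (void) env; (void) nargs; (void) args; (void) data;
  return Qt;
}

/* Equivalent of (fset 'NAME FUN): set the function cell of NAME.  */
static void
bind_function (emacs_env *env, const char *name, emacs_value fun)
{
  emacs_value Qfset = env->intern (env, "fset");
  emacs_value Qsym = env->intern (env, name);
  emacs_value args[] = { Qsym, fun };
  env->funcall (env, Qfset, 2, args);
}

/* Entry point: called by Emacs when the module is loaded.  */
int
emacs_module_init (struct emacs_runtime *ert)
{
  emacs_env *env = ert->get_environment (ert);

  /* Values are normally local to one call; a global reference outlives it.  */
  Qt = env->make_global_ref (env, env->intern (env, "t"));

  bind_function (env, "mymod-test",
                 env->make_function (env, 0, 0, Fmymod_test,
                                     "Return 42.", NULL));
  bind_function (env, "mymod-true",
                 env->make_function (env, 0, 0, Fmymod_true,
                                     "Return t.", NULL));
  return 0; /* zero tells Emacs that initialization succeeded */
}

/* Build, assuming GCC on Linux (adapt the include path to your Emacs tree):
     gcc -I/path/to/emacs/src -fPIC -c mymod.c
     gcc -shared -o mymod.so mymod.o  */
```

For brevity the sketch does not check for pending non-local exits after each call; a more careful module would do so before returning from emacs_module_init.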
I intern the symbol t, which is true in Emacs Lisp, and store it in this global variable here, and in this other function I can just reuse it without having to intern it again. It just saves a couple of calls. So unless you need global references, you don't need to worry about memory management. You can only use the Emacs values you allocated within Emacs C Lisp functions; you cannot have a thread that accesses those values outside of when you're called from Emacs. That's one of the limitations. Usually you would do this to cache values, because you don't want to recompute them every time you run a function.

So, error handling. Emacs Lisp has signals, but C doesn't. Signals work kind of like exceptions in other languages, and C doesn't have them, so somehow you have to convert between those two different mindsets. The way we implemented it in Emacs is that if a function signals when you call it, there's a flag in the environment pointer, which you can check every time you call a function, that says whether it exited via a signal, returned normally, or used throw. So you have access to signal and throw. It's similar to errno in C: every time there's an error in a libc function call, the libc function sets errno to a certain value, and you have to check it afterwards. So we have something similar. You can clear the flag by calling non_local_exit_clear.

There are two ways to go about it: either you check every single API call, or you only check after the important ones, because every time you make an API call, it checks whether the flag is already set, and if it is, it just doesn't do anything and fails automatically. So you don't actually have to check every call if you know what you're doing. But it's very verbose to check all the time.
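The "sticky flag" behaviour is easier to see in a small self-contained model. The sketch below is plain C and does not use the real header: toy_env, toy_divide, toy_exit_check, toy_exit_clear and toy_chain are hypothetical names that only mimic the pattern, with division by zero standing in for a Lisp signal. In the real API the corresponding entry points are, as far as I know, non_local_exit_check and non_local_exit_clear on the environment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the environment; NOT the real emacs_env. */
typedef struct {
    bool pending_exit; /* set once a call "signals" */
    int  work_done;    /* number of calls that actually ran */
} toy_env;

/* Every entry point checks the flag first and does nothing if it is set. */
int toy_divide(toy_env *env, int a, int b)
{
    if (env->pending_exit)
        return 0;                 /* an earlier call failed: skip the work */
    if (b == 0) {
        env->pending_exit = true; /* "signal": record the failure */
        return 0;
    }
    env->work_done++;
    return a / b;
}

bool toy_exit_check(const toy_env *env) { return env->pending_exit; }
void toy_exit_clear(toy_env *env)       { env->pending_exit = false; }

/* Chain several calls and check the flag once at the end.
   Returns true and stores the result in *out on success.  */
bool toy_chain(toy_env *env, int divisor, int *out)
{
    assert(env != NULL && out != NULL);
    int x = toy_divide(env, 100, 5);     /* always succeeds */
    int y = toy_divide(env, x, divisor); /* may "signal" when divisor is 0 */
    int z = toy_divide(env, y + 8, 2);   /* silently skipped after a failure */
    if (toy_exit_check(env)) {
        toy_exit_clear(env);             /* handled: reset the flag */
        return false;
    }
    *out = z;
    return true;
}
```

Because a failed call turns every later call into a no-op, a single check at the end of the chain is enough to detect that something went wrong; the price is that the intermediate return values are meaningless once the flag is set.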
So that's an issue. That's how it is if you want to go from exceptions to regular error codes.

I also had to add a new Emacs Lisp object type so you can wrap any kind of pointer. If you use a library that gives you a handle when you open a resource, these functions in the API let you store that handle inside a user pointer. You have functions to make it and to get and set it. There's also a finalizer attached to it: you can set a function that will be called when the object is garbage collected, so you don't have leaks this way.

So I have a demo for you. I have a really simple module. Can everybody read it fine? In this directory I have an Emacs configured with modules. You have to trust me on this; I won't compile it again, it takes kind of long. And in this folder I have mymod.c. This is the header that has all the API defined. You have the mandatory symbol. You have the C function here, which we want to expose to Emacs.
It just returns 42. This is the bind thing I showed earlier, which sets the function cell of the symbol. If you have a function and you want to be able to call it by the name foo, you pass foo here, and here you pass the lambda function that you want to call. So you intern fset, you intern the name of the function, and then you just make the call by doing a funcall.

Usually Emacs packages have this way to say they're already loaded: you call provide at the end of your Elisp packages. So here I just do the same thing. This is equivalent to calling provide with the symbol passed as argument.

So yeah, this is basically what I showed earlier. You make a lambda that takes zero arguments, you make it call this C function, and you pass it this docstring and the arbitrary pointer; I just pass NULL because I don't really use it this time. I call bind_function to make this function callable by this name, and then I say this module is loaded under the name mymod, basically. Then I return zero. This function gets called the moment I load the module in Emacs.

So I just compiled the module; you can see the commands being run here. In order to be able to load the module I just built, I have to add it to the path that is searched for modules. It's called load-path. Now I just require it. This is the symbol I used when I called provide, so this is what I require. It has been successfully loaded. Now I can call the function, and it just returned 42.

There are a couple of other slides, don't leave yet. The title of the talk said you can also extend Emacs using other languages. Any shared object which follows the API we saw is a valid module and can be loaded. So if your language can compile to a shared object and manipulate pointers in one way or another, it is probably usable. People have been writing shim layers to be able to write modules in other languages.
So I know there's one in Rust, OCaml, Go, Nim, and probably others. I haven't tried them, but they're there, so feel free to try them out. There's extensive documentation written by Philipp Stephani. There's a link there you can check; it covers many cases, corner cases, and questions you might have as well. Chris Wellons has a blog where he also experimented with it. He made a simple C module, and he also has a post about how he managed to use signals, Unix signals: he made a module with a thread that does certain things, polling and other things, and every time this thread wants to signal back to the Emacs process, it sends a Unix signal. It's a nice way to have async requests. Otherwise, with the pure API, you can only call your module from Emacs; there's no way for your module to independently call back into Emacs. There's another guy who made plenty of modules. He never talked to me, but he made a lot of stuff on GitHub.

And I'll go very quickly. There's already a bunch of existing modules, mostly library bindings. There's one for SQLite, Csound, OpenSSL, JSON parsers and other things. Some people have even embedded interpreters of other languages in a module, so you can run Python in Emacs, or Ruby, or Lua, or whatever.

The next step would be to write a foreign function interface. At the moment you have to write C or something else as an intermediary layer to actually access your library. Another way to go about it would be to stay fully in Emacs Lisp and load any library, not specifically a module. Tom Tromey has done some work on this. He made a module that allows you to write Lisp to load any library. He hasn't implemented the GPL check, so it will probably never get merged. So yeah, that's it.