We're going to talk about C++ extensions, and C extensions a little bit too, in Ruby. I'm Charles Cornell. I've been writing Ruby apps for about five years. I've got slightly less time in web apps; most of my experience is writing Ruby applications to automate tasks in semiconductors, which is my background. What got me interested in both C++ and Ruby was that I had some performance problems getting data into and out of a SQLite database, so I wrote a C++ wrapper around that to get past the holdup. And I wanted to use it everywhere. I had a standalone C++ app that called that library, I had a Ruby app that I wanted to call it from, and I also had an Octave application as well. If you don't know, Octave is an open source MATLAB clone that they use a lot in [unclear]. And this is just some stuff that I picked up along the way.

So the first question is, when would you want to use a C++ or C extension? The obvious answer is performance. There are a couple of other times, like if there's already a C library out there that does what you need, why reinvent the wheel? And the other option, which was the case I had, is that I had a C++ library but I wanted to call it from multiple scripting languages. If I had implemented that in Ruby instead, that would have been more difficult. So it's important to know which one of these is the requirement for you, so that you select the correct tool to build your extension.

So if you're going for performance, if that's the driver for you, what can you expect? Well, like in any code, if you make bad choices about how you architect it, maybe not very much. Probably a lot less than you were hoping for. That's why, when you start, you want to make sure you understand exactly what is slow, because that affects what you push into the C extension and how you define your interface.
The definition of the interface can have a significant impact on the performance of that extension. Just as an example, say you have a loop that you go through a bunch of times and it's taking a while. You want to make sure you port enough of the loop. What you port to your extension needs to take enough time that the overhead of going from Ruby to C and back to Ruby isn't a significant portion of the time spent in C. I've seen performance improvements of anywhere from 3 to 4x to over 100x.

So, simple advice for getting started: whiskey does not hurt at all. It's a really good way to start your endeavor there. A couple of things to keep in mind. If you build a traditional C or C++ extension, it is going to depend on the Ruby interpreter you use. So it's not going to work with JRuby. It won't work with Rubinius. And there are even differences between versions of Ruby as well. Those are typically fairly minor, but they do require a recompile. You're going to have some linker errors. Just get your hands dirty and get over it; they're not that bad. And then understand your performance needs and what interpreter you're going to use, so that you select the best tool for the job.

With that, let's go through some of the tools that I came across when I was looking to do this. FFI has been around for a really long time, but it's actually fairly new to Ruby. It stands for Foreign Function Interface, and it's a way, in pure Ruby, to call an arbitrary C function from any shared library, all at runtime. There's no compile step, and you don't have to set up anything. In Ruby you define the interface and then you can call the C function that you need. It's becoming fairly sophisticated, and for C only it's a pretty good choice and it has some advantages. The other one is Rice. It's basically a C++ DSL for interfacing with the Ruby C API.
So if you're writing C++, then you can define your interface in C++, and that's what Rice lets you do. RB++ is a tool that sits on top of Rice, and it basically writes the Rice code for you. So we're going to talk about that. And then SWIG is probably the one you'd think of first. It's been around the longest, and its advantage is that it supports multiple languages. That's why we're going to talk about it more later.

So, FFI. I tried to be clever and find some pictures and stuff. It's really hard to find a picture for FFI, so I had to stoop to the French Forces of the Interior, which was a resistance group in World War II, but they came to my rescue. Anyway, like I said, FFI is a clean solution for C libraries. It's not good for C++. It doesn't reflect classes, and C++ name mangling gets in the way, so it's not really applicable for that. It's Ruby only, but independent of the interpreter, like I said. And it does have some additional overhead versus a traditional C extension. But if you just want access to an existing C library, then it's probably the least work to do. So it's a really exciting solution; it just didn't do exactly what I needed.

So, Rice is the other tool. Rice, Bill and Bolly, if anybody cares. Like I said, Rice is a C++ DSL for interfacing with the Ruby C API. It's not a generic interface generator, so it is Ruby only. But if you like C++, then it's probably a pretty good choice, because you can do everything for your extension in C++. You don't have a separate SWIG language or something like that. And RB++ makes Rice more like SWIG. It uses something called GCC-XML, which produces an XML description of your C++ library, and then it writes the Rice code for you and compiles your extension to build that interface. I have not used this one. I just looked into it as one of the possible choices, but it seemed kind of interesting in certain cases.

So, SWIG. Turns out Swig is a club in San Francisco.
So, kind of cool. SWIG stands for Simplified Wrapper and Interface Generator. The way it works is that it parses your C++ header files, and it understands nearly all of the C++ syntax, and then it writes the wrapper code for you to build the Ruby classes and the methods and all that. It does reflect the C++ class hierarchy, and it supports the multiple languages listed here. I highlighted a couple just to point out a few things. Python is highlighted because that's where SWIG came from. The guy that started SWIG was a Python developer, and I don't know when he started that. It was like in '92 or something. And then Ruby, obviously, and the support for Ruby is actually pretty good. It's not bad. And then Octave as well, and that was what interested me: I wanted to support Ruby and Octave with my extension.

So, let's look at a quick little example to gauge where you can gain performance and how much. This is a little test I ran. It's 100,000 inserts into a SQLite database. The top two lines make a 100,000-element array with the rows you're going to insert. In the middle section here, we open a transaction, prepare a SQL statement, and then loop over our data array and execute the SQL to insert the 100,000 rows. And then here, in the C++ extension, I just made a function where you give it the SQL code and you pass in the full array. So basically I moved that looping into C++, and that's all I did. The Ruby loop took 9.8 seconds, and in C++ it took 2.6. So it was a 3.7x speedup, and all I did was move a loop into C++.

Now, why did that happen? It's because of the overhead of going from Ruby through the Ruby methods, through the wrapper, down into the C++, and back up, with all the type conversions and everything, once per row. If you pass the whole array in, then that doesn't happen, right? So small stuff can make a difference when you're in that tight inner loop.
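The point about interface granularity can be sketched in C++. This is a minimal illustration and not the code from the talk: `RowSink`, `insert_row`, and `insert_rows` are hypothetical names, and an in-memory vector stands in for the SQLite database. The counter only shows how many times the scripting-language boundary would be crossed under each interface shape.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a database wrapper exposed to a scripting
// language. It stores rows in memory and counts boundary crossings.
class RowSink {
public:
    // Fine-grained interface: the script calls this once per row, so
    // every row pays the wrapper and type-conversion overhead.
    void insert_row(const std::string& row) {
        ++boundary_calls_;
        rows_.push_back(row);
    }

    // Coarse-grained interface: the script passes the whole batch once,
    // and the loop runs entirely on the C++ side.
    void insert_rows(const std::vector<std::string>& rows) {
        ++boundary_calls_;
        rows_.reserve(rows_.size() + rows.size());
        for (const std::string& row : rows) {
            rows_.push_back(row);
        }
    }

    std::size_t boundary_calls() const { return boundary_calls_; }
    std::size_t row_count() const { return rows_.size(); }

private:
    std::size_t boundary_calls_ = 0;
    std::vector<std::string> rows_;
};
```

With 100,000 rows, the first shape crosses the boundary 100,000 times and the second crosses it once, which is the difference the timing comparison above is attributed to.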
All right, so when you're building a SWIG module, what happens? I just wanted to throw this up for a little information. You run the SWIG command and you tell it, hey, this is going to be C++, not C, and the target is Ruby, and you pass it the interface file. It's going to make a C++ file that has the wrapper code and the interface description in there. Compile that into a .o, link that into a shared library, and then you require, you know, CMC underscore dbi, and it'll load it up.

So what does the SWIG interface file look like? Well, it's actually relatively simple. If you want to do simple stuff, it's pretty easy to do that in SWIG. That's one of its positives. You define the module name, and then it has some canned utilities that provide stuff you could write on your own, but they've already written it for common things, like the C++ string type maps that convert C++ strings to Ruby strings. The second include is for the C++ Standard Template Library vector container, and that has the type maps so that you can pass a vector into and out of C++ functions. And then with this include, you're just including the header for the code that you want to make visible. It has the class hierarchy that you want to make visible, and it's plain C++, so it's just what you wrote. And then this section is copied verbatim into the C file, and it's where you put stuff that's required to compile the file. The reason they have two sections is that sometimes you have stuff required to compile it that you don't want to make visible in the scripting language. So that's what the two sections are for.

All right, so as I said, it does come with pretty full-featured STL support. It has built-in type maps for all the STL containers. Those type maps support passing containers in as arguments to functions and returning them as values from functions. It makes wrapper objects around those to support that, and they have Ruby-ish interfaces, and I use that term somewhat loosely.
So let's just take vector as an example. If I want to pass a vector as an input argument to a function in SWIG, I would add this line to my SWIG interface file. What that's doing is telling it, hey, I want to use a vector of integers as an argument, and I'm going to name that IntVector. So in Ruby, the type name is IntVector. And then here's the function that you would call, and all it does is loop over the vector and print out the elements. On the Ruby side, you could say something like, hey, I want a new IntVector, push three ints on there, and then pass that to the function, and it would print out the elements.

If we take a look at our new IntVector in IRB, just to see what it looks like, you can create a new one and initialize it with some elements that way, and it's going to return this hieroglyphic-looking thing that says it's a standard vector of int, a basic C++ class, so it's kind of easy to see what SWIG did in that case. And then you can use it in many ways just like a Ruby array. You can loop over it and print out the elements. You can push a new element on. If you try to push something that's not an int, it will raise an exception, a SWIG type error or something, so you do know. It enforces that because it really is not a Ruby array. You can ask it how big it is. You can ask it how much memory it has allocated. That's directly from the C++ STL, and it will typically be different from the size. And then if you call .join, you get no love, because it's not really a Ruby array, so you have to stick with the basic functionality.

The other thing you do in SWIG that I want to talk a little bit about is exception handling, because you can throw exceptions in C++ and you can throw exceptions in Ruby. It would be great if, when I threw one in my C++ extension, it looked in Ruby just like it came from Ruby, and I could handle it the same way.
Well, this crazy-looking code is what does that. This is copied verbatim out of the SWIG manual, and this particular method works for any scripting language. All you're doing is wrapping the call and catching anything derived from the base exception class in C++, and then setting the message, and it will show up as a Ruby RuntimeError exception in the script. So that works great, right? If you put that in, you're covered, you can catch all your exceptions, and that's not bad.

But sometimes you might want to throw different exceptions for different things that go wrong. You might have a SQLite3 statement error, or a SQL error, or a database locked error, or something like that. To make that happen, the solution I came up with was to add another section in here where I catch a base class that I define, and then I define a Ruby class with the name of whatever class I threw. It's a child of RuntimeError, and then I raise it with the what() message from the C++ exception.

The C++ code to support that: I define my error base class, which inherits from the C++ built-in exception class, and then I derive a foobar error from that. The only thing special about it, really the only reason I'm doing that, is so that I can set the name of the class that it's going to create in Ruby, because C++ stinks and there's no way to get the class name from within that class in C++. It's remarkable how difficult they make that. Anyway, there are some compiler-specific and platform-specific ways of getting it, but they all give a different answer if anything changes, so this is the best I could come up with.

All right, so why did I go with SWIG? Really, the main reason is that it's the only one that would let me build an extension for Ruby and for Octave without repeating myself, so it was pretty attractive for that.
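The exception hierarchy described above can be sketched in C++ roughly as follows. The names `ErrorBase`, `FoobarError`, and `class_name` are hypothetical and chosen for illustration; the talk does not show the actual identifiers. The idea is that each exception carries the name of the Ruby class the wrapper should raise, since standard C++ offers no portable way to ask an object for its class name. A SWIG `%exception` block could then catch `ErrorBase` and use that name when defining and raising the Ruby class, though that wrapper code is not shown here.

```cpp
#include <exception>
#include <string>
#include <utility>

// Hypothetical base class: stores the message plus the name of the
// script-side exception class that should be raised for this error.
class ErrorBase : public std::exception {
public:
    ErrorBase(std::string class_name, std::string message)
        : class_name_(std::move(class_name)),
          message_(std::move(message)) {}

    // Message that the wrapper would pass along when raising in Ruby.
    const char* what() const noexcept override { return message_.c_str(); }

    // Name the wrapper would use for the Ruby exception class.
    const std::string& class_name() const noexcept { return class_name_; }

private:
    std::string class_name_;
    std::string message_;
};

// Each concrete error hard-codes its own name in its constructor.
class FoobarError : public ErrorBase {
public:
    explicit FoobarError(std::string message)
        : ErrorBase("FoobarError", std::move(message)) {}
};
```

Hard-coding the name in each constructor is repetitive, but it gives the same answer on every compiler, unlike the compiler-specific mechanisms mentioned above.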
One of the other advantages is that it lets you automate building the interface. In other words, once the SWIG interface file is set up for the data types that you're going to be using within your app, if you go and define a bunch of new methods, you don't have to go re-edit this thing. You just write the C++ like you have to anyway, and then you re-run SWIG and rebuild, and the new methods get reflected into the Ruby extension automatically. That was the advantage. And that's it.

Any questions about SWIG or, hopefully not, Rice?

One of the things I've noticed is that when I've got Ruby code calling C++ code calling Ruby code, if I throw an exception in Ruby, it will actually skip over those C++ frames. I don't know if it uses goto or whatever, but something happens so that the destructors are never called. I was wondering if SWIG has some sort of mechanism where you could do a catch and re-throw at each language boundary, to try and make sure that you don't have that.

Yeah, that's a good question. Demi, did you use that?
I'm not using SWIG right now, so I was wondering if something could help me out, because I have to manually make sure of that every time there's a boundary.

Yes. I probably don't know the answer, because I never did that. I didn't go from Ruby to C++ and have that call Ruby. The good thing about SWIG is that it's very flexible, and the bad thing about SWIG is that it's very flexible. Once you go off the beaten path and get down in there, it can be a little quirky how you do stuff. But this exception code gets put in the wrapper for every method; it gets injected. So there may well be a way that you can inject something, because that's how it works. It's like, hey, when you see this happen, I need to do this type conversion, use this code; or, in every wrapper, inject this code. So I don't know for sure, but it does have quite a lot of options like that.

So you handle the construction? I mean, does SWIG handle the construction of the objects, or does Ruby do that?

SWIG writes the C code, the Ruby C API code, that constructs the class hierarchy on the Ruby side. If you go look at the bottom of the file SWIG generates, there are a bunch of define-class calls, and then it defines all the methods and everything, and above that it has all the C wrapper code to do the type conversion. But yes, it will build all of that automatically.

Okay, but it doesn't do the instantiation? That's still Ruby, and the Ruby garbage collector?

It has methods that let you affect how you want to handle garbage collection. You can register objects in SWIG with the garbage collector, and then it knows to call the destructor, or you can just not register them because you don't want it to do anything. But you do have hooks to do that. I haven't done a lot of that, but I think they're there.

Okay. So does it allocate the C++ objects on the stack, or can you control that?
Yeah, so you mean, when you say IntVector.new, where does that go, does it call new? I guess this is kind of what you were talking about. It does call the C++ constructor, yes. And when it does that, the object will be registered with garbage collection, so when it goes out of scope in Ruby it will call the destructor.

Because I'm thinking this would be great for a game, if you have just a huge array of vectors that you want to process in C++. You have to allocate a lot of memory, and then I guess in C++ you would destruct them all, or whatever you need to do.

Yeah, well, that's why it has the wrapper around the IntVector, so you don't have to build the array in Ruby and then copy it into C++. With that strategy you're never going to get any speedup. So if you do it the SWIG way, you won't make a new copy of it in C++, but you'll be stuck using the STL vector; you won't have the full Ruby stuff. But if you're just building up an array, that's not too bad. It does have some stuff to keep that copying from happening. That's the other advantage of it over something like FFI, because FFI will have more overhead, since it has more steps in that wrapper process. It doesn't necessarily make copies of stuff, but that's why it does have more overhead than a traditional C++ extension.

[unclear] Yeah, we've got time for that. Anybody have one? All right, cool. Thank you.