It's called Knocking Ruby's Date and DateTime Performance Out of the Park with home_run. My name is Jeremy Evans. I wrote home_run because of the performance issues in the Ruby standard library. Basically, if you were trying to use the Date and DateTime classes in performance-sensitive code, it's pretty much a swing and a miss. Now, home_run basically destroys the performance of the standard library, especially on older versions of Ruby. I'm going to talk about how a little bit later, but I think it's first appropriate to talk about the history of the standard date library in order to understand why it is the way it is. So let's take a trip in the time machine way, way back to Ruby 1.0 in 1998. Back in Ruby 1.0, date.rb was a fairly simple 228-line file written by Yasuo Ohba, and it stored dates as year, month, and day integers. Now, this code is a simplified version of what was in Ruby 1.0. I think that, other than the choice of 1 as the default for the year, it's not bad. The date.rb file from 1.0 can actually be run unmodified on Ruby 1.8. And Ruby continued to use a modified version of this 1.0 code until Ruby 1.6.0 was released. Now, Ruby 1.6.0 included a new version of the standard date library written by Tadayoshi Funaba, and this is the ancestral version of the current standard date library. It stored the date as a Julian day integer and supported a modifiable day of calendar reform, where Ruby 1.0 had a static day of calendar reform. Like the 1.0 code, the 1.6.0 code can also be run unmodified on Ruby 1.8, where it is about two-thirds of the speed of the 1.0 code. Now, both the 1.0 code and the 1.6.0 code only handle dates; they do not handle date times. Support for DateTime was not added until Ruby 1.8.0. As of Ruby 1.8.0, the Date and DateTime classes were pretty much the same.
In order to handle fractional days, the date was now stored as a Rational, and it recorded a time zone offset, even though dates themselves don't have fractional components or specific time zones. So basically, Ruby 1.8 and Ruby 1.9 Date objects do not really represent dates. They represent date times, usually at midnight UTC. Now, the 1.8.0 version of Date was about three times slower than the 1.6.0 version and four times slower than the 1.0 version. So from 1.0 to 1.6.0 to 1.8.0, Date got slower, and since then, Date has gotten a little bit faster. For example, in Ruby 1.8.7, instantiating Date objects is about 46% faster than in 1.8.0. However, it's still half the speed of 1.6.0 and a third of the speed of 1.0. Now, in Ruby 1.9.2, the code looks and performs pretty much the same as it does in 1.8.7. By that I mean that, while Ruby 1.9 is much faster than 1.8, if you take the 1.9.2 date code and run it on 1.8.7, it's about the same speed as the 1.8.7 date code. So let's shift gears for a second. Imagine you're writing some code and you want to speed it up. It's crawling and you cannot figure out why. You want it to go faster, but even after optimizing some of your application code, it still does not go as fast as you think it should. Well, first, you fail: you didn't profile first. And then when you decide to profile, you see that your application is not spending much time in your own code. It's spending almost all of its time instantiating Date objects. So now instead of you failing, it's Ruby failing you. And there's not much you can do. You need to pass a Date object to another API, and all you're doing is giving it year, month, and day integers to get a date, and you can't fathom why it's taking so long. So this basically explains your feelings at that point. You feel helpless: you have to pass in a Date object, and there does not appear to be an easy way, or at least a fast way, to create one.
At this point you have three options. You can give up and resign yourself to slow code, and I expect that this is the path that most projects take. Or you can attempt to work around Date's performance issues by reimplementing parts of Date and Rational in order to speed things up. For example, let's take everyone's favorite Ruby ORM, DataMapper, and the DataObjects adapters that run underneath it. If you peek under the lid, you'll see that most of the DataObjects adapters reimplement parts of Date and Rational in C in order to improve performance. So these are the prototypes for the C functions that the Postgres DataObjects adapter uses, and most of the other DataObjects adapters are similar. Now I posit that many of these methods should not need to exist. There's no reason a database library should have to write its own date and time calculation functions or functions to reduce rational numbers. Now some of these functions probably should exist, but they should be small, with simple implementations that either create a Ruby string and pass it to one of the date parsing functions, or parse the string themselves and call one of the standard constructors. Unfortunately, in the quest for decent performance, these methods are neither small nor simple. This code is taken from the parse date time function. It attempts to do a greatest common divisor calculation in C and calls Date's new! method with a pre-computed astronomical Julian date, because letting the standard library do the calculation is a lot slower. If the standard date library had decent performance, you would not need all this code. DataObjects is not the only database library that tries to work around Date's shortcomings. Swift is a fairly new database library, and because Date is so slow, they skip calling Date's constructors entirely. The typecast timestamp function here parses the string in C and creates a Ruby Time object.
Then to_date is called on that Time object in order to get the date. Swift does this because it's faster to create a Ruby Time object and convert it to a date using to_date than it is to create a Date object the standard way. C libraries are the most common places you'll see code overriding the Date and DateTime code, but some pure Ruby libraries do so as well. This example is taken from ruby-ole, which is a Ruby library for dealing with those ugly Microsoft formats like the pre-Office 2007 Word documents. ruby-ole has a FileTime class which is a subclass of DateTime. There are a couple of things to note here. First is the comment right at the top, which says that DateTime is slow and that this is a faster version for FileTime. Second is the middle block, where it reimplements the Julian date calculation. The reason this is faster than the standard library's calculation is that it uses a Float instead of a Rational to store the date. Now there are probably other examples of specific libraries that reimplement parts of Date for performance. date-performance was the first general library I'm aware of that attempted to improve Date's performance. It was written in early 2007 by Ryan Tomayko, and it overrides a few of Date's internal methods in C in order to improve performance. It made creating dates about five times faster, and it also overrode strftime and strptime to use the C-level APIs directly if possible. In the cases where the C-level APIs could handle it, formatting using strftime is about 12 times faster and parsing with strptime is about 45 times faster. And because it kept the same data structure and just reimplemented algorithms in C, there weren't many significant compatibility issues. However, date-performance only solved part of Date's performance problems. It didn't override any of the DateTime methods, so DateTime did not get significantly faster.
It didn't speed up Date.parse at all, and if the C strftime and strptime functions couldn't handle the format strings used, it was not significantly faster there either. Those are the first two options you can take if you want to improve slow date code. The third option you can take is to write your own Date and DateTime classes. Now my first attempt at this was called third_base. It was written in late 2008 and it was a pure Ruby implementation. It was about four times faster at creating dates and about three to twelve times faster at parsing, and it had a lower memory footprint as well. It also supported pluggable parsers, so you could write your own date parser and plug it into Date.parse. Now I used third_base in production after creating it, but it never saw widespread use. It had some significant compatibility issues and it just wasn't fast enough to justify them for most people. Anyway, since creating third_base in 2008 I gained more experience programming in C. In July of this year I was itching to write a non-trivial Ruby C extension and decided to give a shot at writing an API-compatible replacement for Ruby's Date class in C. And because I named my previous library third_base and had high hopes that the C library would be better, I called it home_run. So before I jump into an explanation of why home_run is so much faster than the standard library, let's take a step back and look at the big picture to determine the reasons why the standard date library is slow. Now there's a saying that computer science is about two things. The first is algorithms and the second is data structures. It's my belief that the number one reason why instantiating Date objects is slow is the choice of data structure used to store them. The standard date library stores both dates and date times as three separate pieces of data. The first is the astronomical Julian date.
Now there are two reasons why using the astronomical Julian date is slow. The first is that most users are going to be creating dates using the standard Gregorian calendar with years, months, and days, so you need to do a fairly expensive conversion calculation just to instantiate the object. The second issue is that Ruby does not have a fast arbitrary-precision number class. A Float does not have high enough precision to store the astronomical Julian date, as it is not able to give nanosecond-level resolution for times on any recent date, and dates far in the future would have even less precision. BigDecimal would be a possibility, and on Ruby 1.8 it's actually faster than Rational. On Ruby 1.9 Rational is implemented in C and it's about two and a half times faster than BigDecimal. In any case, the choice was made to store the astronomical Julian date as a Rational. Incidentally, Rational being implemented in C is the number one reason why home_run provides less of a speedup on Ruby 1.9. So the use of Rational to store date times at least makes sense. For Date objects I really don't think it does. I mentioned earlier that dates in the standard library are stored pretty much as date times. This means that if you add a fractional part to a date, you may get something that looks like the same date or you may get something that looks like a different date. Let's say you take a date, which we might get with Date.today. If you add half a day to it, you'll appear to get the same date. If you add another half day, you'll appear to get a different date. Now the second piece of data that all standard Date and DateTime objects store is the offset from UTC. For date times this makes sense, as they need to be able to store an offset in order to accurately compare date times in different time zones. However, I don't think dates themselves should have an offset, because a date is not a particular time or a particular zone.
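The half-day behavior described above can be reproduced with the standard library. This is a minimal sketch, not the code from the slides: the fixed date is an arbitrary stand-in for Date.today, and Rational(1, 2) is used for the half day to keep the arithmetic exact.

```ruby
require "date"

# An arbitrary example date (assumption); the talk used today's date.
date = Date.new(2010, 11, 12)

# Add half a day, then another half day.
half_day_later = date + Rational(1, 2)
full_day_later = half_day_later + Rational(1, 2)

puts date.to_s            # 2010-11-12
puts half_day_later.to_s  # still prints 2010-11-12
puts full_day_later.to_s  # 2010-11-13
```

The object in the middle prints as the same date but carries a hidden half-day fraction, which is why the second addition appears to jump a full day.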
A date is a 24-hour period, usually not particular to any one zone. So in general, if you need to be talking about a zone, you'll probably want to be talking about a DateTime or a Time. The offset for a Date object is always zero unless you want to call some private methods. Now another strange result related to offsets is when you mix Date and DateTime objects in calculations. Take the following calculation. You have a DateTime for today at noon local time. You subtract from it a Date object for today's date. What do you expect to receive? Anyone? Well, my personal intuition is that the DateTime is half a day ahead of the Date. But that isn't what you get with the standard library. You get three quarters, since the Date is in UTC and the DateTime is in local time. So what it does here is convert the Date to local time, making it 6pm yesterday, which is 18 hours or three quarters of a day before today at noon. So let's say you go the extra mile and try to ensure that the Date you create has a local offset. The offset will be ignored. In short, mixing Dates and DateTimes using the standard library is going to result in problems unless your DateTimes are also in UTC. The third and final piece of data that the standard date library stores is the day of calendar reform being used for the given date. Now the day of calendar reform is the date of the switch from the old Julian calendar to the current Gregorian calendar. The Julian calendar was first used in 45 BC and is similar to the current Gregorian calendar, except that it had leap years in years divisible by 100. In 1582, Pope Gregory XIII issued a papal bull specifying that the Gregorian calendar would be used henceforth. The Gregorian calendar made years divisible by 100 but not by 400 non-leap years. In order to correct for the accumulated error, they skipped 10 days, so that the day after October 4th was October 15th.
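Both the mixed subtraction and the calendar reform described above can be checked against the standard library. In this sketch the UTC-06:00 offset and the specific date are assumptions chosen to match the three-quarters result; `Date.julian_leap?`, `Date.gregorian_leap?`, and the `Date::ENGLAND` reform constant are standard library names.

```ruby
require "date"

# Noon local time (assumed UTC-06:00) minus a Date for the same day.
noon_local = DateTime.new(2010, 11, 12, 12, 0, 0, "-06:00")
same_day   = Date.new(2010, 11, 12)
difference = noon_local - same_day
puts difference                  # 3/4 of a day rather than 1/2

# The two calendars disagree about years divisible by 100 but not 400.
puts Date.julian_leap?(1900)     # true
puts Date.gregorian_leap?(1900)  # false
puts Date.gregorian_leap?(2000)  # true

# With the default (Italian) day of calendar reform, ten days are skipped.
day_after_reform = Date.new(1582, 10, 4) + 1
puts day_after_reform.to_s       # 1582-10-15

# The same Date class with the English reform date skips days in 1752.
english_next_day = Date.new(1752, 9, 2, Date::ENGLAND) + 1
puts english_next_day.to_s       # 1752-09-14
```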
Now most of the Catholic countries adopted the Gregorian calendar quickly. England and its commonwealth adopted it in 1752, and Russia did not adopt it until 1918, which is the reason why Russia's October Revolution was actually celebrated in November. Anyway, the standard date library allows each Date object to have its own day of calendar reform. That's right: a Date object does not just store whether it is a Julian date or a Gregorian date, it stores the day of calendar reform in reference to that date, so that if you add or subtract some days or months to or from the date, it knows when to automatically switch from being a Julian date to being a Gregorian date or vice versa. Think about that: is that really necessary? I think it's about as helpful as a sixth finger. I'll take a step back and ask, who really cares about the day of calendar reform at all? I don't think anyone cares. Well, that's not quite the case. I mean, I think Yasuo Ohba and Tadayoshi Funaba probably care. Anyone dealing with historical dates that use the Julian calendar probably cares. But the vast majority of Rubyists, I'd guess, do not and should not care. And why should 99% of us suffer to make the job of the 1% easier? Wouldn't it be better to have separate classes that deal with other calendars? It's not as if handling the Julian and Gregorian calendars in the same class covers every case anyway. According to Wikipedia, there are over 40 calendars in active use and over 20 historical calendars no longer in active use. So we've talked about the issues with the standard date library in terms of the data structure, and that's most of the reason for the slowness in instantiating Date objects. In many cases, the algorithms that the standard library uses and the ones used by home_run are pretty much the same. But the Date strftime instance method is known to be quite slow.
It uses the block form of the String gsub instance method, breaks every match into three parts to build a hash of options for every match, and for each of the recognized formats calls another method to create the replacement string, and those methods are not fast either. Now if this wasn't bad enough, the strftime method also calls itself recursively in many cases, including the case where the default argument is used. Now I'm not saying this code sucks. It's actually pretty decent in terms of reuse, but from a performance standpoint it's pretty bad. It also lacks comments and it's not really easy to figure out what is going on. I posit that it's not a very good example of variable naming, and since these options hashes are not used in this method itself, you just have to trace the code and remember what's going on. These options aren't documented anywhere and I don't think anyone actually uses them, so you would probably waste your time figuring out what they do. So like strftime, the Date strptime class method is another method with known performance issues. One of the reasons it is slow is that it uses a Date::Format::Bag instance to hold some temporary data, and this class is actually just like a hash except it's much slower. The only benefit is that you can use regular named methods instead of using the hash getter and setter methods. Later on in the method it calls an internal strptime class method which does the majority of the work. This method looks very similar to the strftime instance method. It uses the block form of the String scan instance method with nested regular expressions and recursion, which allow for reasonably succinct but poorly performing code. The bag is a slightly nicer API, but unfortunately the code pays a performance penalty for using it.
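The trade-off between a method-style bag and a plain hash can be sketched in a few lines. `SketchBag` below is a hypothetical stand-in written for illustration, not the standard library's class: it only shows why such an object is slower, since every read and write goes through `method_missing` before reaching the underlying hash.

```ruby
# Hypothetical method_missing-based bag (illustration only).
class SketchBag
  def initialize
    @elem = {}
  end

  # bag.year = 2010 stores a value; bag.year reads it back.
  def method_missing(name, *args)
    key = name.to_s
    if key.end_with?("=")
      @elem[key.chomp("=").to_sym] = args.first
    else
      @elem[name]
    end
  end

  # Return the collected values, dropping unset entries.
  def to_hash
    @elem.reject { |_key, value| value.nil? }
  end
end

# Method-style access: nicer to read, but each call is a failed method
# lookup followed by string manipulation and a hash access.
bag = SketchBag.new
bag.year = 2010
bag.mon = 11

# Plain hash: the same data with direct getter and setter calls.
plain = {}
plain[:year] = 2010
plain[:mon] = 11
```

Both objects end up holding the same data, so swapping one for the other changes only how the temporary values are accessed.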
If you use a plain hash for the temporary data and use the standard hash methods like this, it gives you an instant 25% performance improvement. I submitted a simple patch to do this, but it was rejected by Tadayoshi without a clear explanation as to why. And no, I'm not bitter about it. The final slow method I'd like to discuss is the parse class method. This starts off similarly to strptime, duping the string and creating a new format bag just to hold some temporary data. It's interesting to note that here in parse, using a plain hash instead of the format bag speeds things up about 50%. Alright, I am actually a little bitter about that. And maybe that's because I think that speedup is greater and parse is used a lot more than strptime. Anyway, back on topic. After creating the bag, it tries to remove unwanted characters from the string and parse out a time component and a day name from the string. I won't go into details about the _parse_time and _parse_day methods, but they both use regular expressions for parsing. After that, it tries to parse out the date using a bunch of different date formats in series. So if the string matches the regular expression used by _parse_ddd at the end, parsing is a lot slower than if it matches the regular expression used by _parse_eu at the top. And almost all of these regular expressions are not anchored, so in every step it scans the entire string looking for a match. There's actually more stuff after this, including more regular expressions, but it's not the most interesting code, so we should probably stop here, because this is a presentation about home_run and I've already spent way too much time talking about the standard library. So let's step into the design of home_run, why it's so much faster, and the trade-offs that result. I mentioned earlier that the number one reason instantiating Date objects is slow is the choice of data structure.
It follows that the main reason home_run is much faster is the choice of data structure used to store dates. Now this is the C structure that home_run stores dates in. One of the main reasons it is faster is that it stores the year, month, and day information directly. So if you're instantiating a civil date using a year, month, and day, it does not have to convert it. It stores the Julian date separately as a long integer. Now the Julian date is necessary for some calculations, such as adding or subtracting a number of days from a date. The last field is a flags field, which just stores whether or not the JD field or the year, month, and day fields have been filled in. And because the information can be stored in two separate ways, there are two conversion macros used in most of the internal functions. These conversion macros check the flags field, and if the flags do not indicate that the needed data has already been filled in, they call a conversion function to populate the year, month, and day fields using the JD field, or vice versa. So let's jump back to the data structure and notice the absence of a few things. First, there is no storage of fractional days. That's because in home_run dates are dates. They're not date times in disguise. Also missing is a time zone offset. Because a date, unlike a time, is not specific to a time zone, home_run does not store an offset. Finally, the day of calendar reform is not stored, because home_run always uses the Gregorian calendar and there's no point in storing it. Now there are a few trade-offs with this method of storage. The first is that it could be more memory efficient, by either not caching the Julian date or the civil date, or by packing the month and day information inside the year. But doing either of those would be more CPU intensive, and since Ruby takes about 20 bytes of memory for every object, saving 2 to 6 bytes doesn't make a whole lot of sense.
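The two-representation layout with a flags field can be sketched in Ruby. This is an illustration only: home_run does this in a C structure with macros, and the class name, constants, and the textbook proleptic Gregorian conversion formulas below are assumptions rather than the library's own code.

```ruby
# Hypothetical sketch of a date that caches civil and Julian day forms lazily.
class SketchDate
  HAVE_CIVIL = 1  # year, month, and day fields are filled in
  HAVE_JD    = 2  # Julian day field is filled in

  def self.civil(year, month, day)
    new(year: year, month: month, day: day, flags: HAVE_CIVIL)
  end

  def self.jd(jd)
    new(jd: jd, flags: HAVE_JD)
  end

  def initialize(year: nil, month: nil, day: nil, jd: nil, flags: 0)
    @year, @month, @day, @jd, @flags = year, month, day, jd, flags
  end

  def jd_cached?
    (@flags & HAVE_JD) != 0
  end

  def jd
    fill_jd
    @jd
  end

  def civil
    fill_civil
    [@year, @month, @day]
  end

  # Adding days needs the Julian day, so the conversion happens here at most once.
  def +(days)
    self.class.jd(jd + days)
  end

  private

  # Civil to Julian day number (proleptic Gregorian calendar).
  def fill_jd
    return if (@flags & HAVE_JD) != 0
    a = (14 - @month) / 12
    y = @year + 4800 - a
    m = @month + 12 * a - 3
    @jd = @day + (153 * m + 2) / 5 + 365 * y + y / 4 - y / 100 + y / 400 - 32_045
    @flags |= HAVE_JD
  end

  # Julian day number to civil (proleptic Gregorian calendar).
  def fill_civil
    return if (@flags & HAVE_CIVIL) != 0
    f = @jd + 1401 + (((4 * @jd + 274_277) / 146_097) * 3) / 4 - 38
    e = 4 * f + 3
    h = 5 * ((e % 1461) / 4) + 2
    @day = (h % 153) / 5 + 1
    @month = (h / 153 + 2) % 12 + 1
    @year = e / 1461 - 4716 + (14 - @month) / 12
    @flags |= HAVE_CIVIL
  end
end

date = SketchDate.civil(2010, 11, 12)  # no conversion happens here
later = date + 50                      # first use of the Julian day
```

Creating the date from year, month, and day does no arithmetic at all; the conversion only runs when something asks for the other representation.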
Another trade-off is that because a C structure is used, there is a range limitation. home_run has only about a 10 million year date range on 32-bit systems. I find this limitation is not an issue for most people, especially considering that its use of the Gregorian calendar means that home_run is not suitable for dates before 1582. Now home_run uses a different structure for date times. It has the same fields as dates but adds additional fields for storing the time and the time zone. Similar to how the year, month, and day fields are stored separately from the Julian date, the DateTime structure stores the time component in two separate ways. The first is the nanos field, which stores the number of nanoseconds since midnight as a 64-bit integer. The second is the hour, minute, and second fields. Storing the time component in two separate ways is done for the same reason it is done for dates: it means that when the user gives the hour, minute, and second, they do not need to be converted before being stored. Now one trade-off here is that the number of nanoseconds within the given second is not cached, so it has to be calculated from the nanos field if it is needed. I find that it's not needed very frequently and it is a fairly simple modulus calculation, so I did not think it was worth adding another field to the structure. Another trade-off is that because we are storing the nanos as an integer, the finest resolution for home_run DateTimes is a nanosecond. Considering that Ruby Time objects also only store nanoseconds in Ruby 1.9 and microseconds in Ruby 1.8, this seems acceptable. Now there is a small negative side effect of not handling fractional nanoseconds.
For example, if you use the step instance method with a step value of one seventh, the final yielded object here will show up as a different date, because the object will represent the last nanosecond of today's date instead of the first nanosecond of tomorrow's date. That's because one seventh of a day includes fractional nanoseconds, which get lost at each step. Finally, the time zone offset is stored in a 16-bit short integer as the number of minutes difference from UTC. One trade-off of storing the offset in minutes is that time zones with fractional minutes are not supported. I don't believe there are any time zones in active use that use fractional minutes like Liberian time and Amsterdam time did. Another issue with home_run storing the offset in the C structure is that there is also a range limitation. home_run actually enforces a 14-hour maximum offset from UTC, since that is the largest offset in current use. Now this is very different from the standard library, which will accept any offset. The standard date library will accept an offset that is more than a year in terms of time. home_run will recognize that as an error and raise an exception. Another trade-off is that because two separate structures are used and they have different layouts, you cannot use the Date and DateTime structures interchangeably in C code. This means that many methods that are defined in Date and inherited by DateTime in the standard library need to have separate versions written for both Date and DateTime. The final trade-off I'd like to mention is that because a C structure is used for Date and DateTime objects, home_run does not work with the allocate class method, and for the same reason it doesn't use marshal_dump and marshal_load, instead using the old-style _dump and _load methods. So now that I've talked about the data structures that home_run uses, let me talk about the algorithms it uses that make it faster than the standard library.
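The integer nanosecond storage and the one-seventh step problem described above can be worked through with plain integer arithmetic. The helper names here are hypothetical, and truncating the step to a whole number of nanoseconds is an assumption about how the fractional part is lost.

```ruby
NANOS_PER_SECOND = 1_000_000_000
NANOS_PER_DAY    = 86_400 * NANOS_PER_SECOND

# Hour, minute, and second to nanoseconds since midnight.
def nanos_from_hms(hour, minute, second)
  ((hour * 60 + minute) * 60 + second) * NANOS_PER_SECOND
end

# Nanoseconds since midnight back to [hour, minute, second].
def hms_from_nanos(nanos)
  seconds = nanos / NANOS_PER_SECOND
  [seconds / 3600, (seconds / 60) % 60, seconds % 60]
end

# The nanoseconds within the current second are not stored separately;
# they fall out of a single modulus operation.
def nanos_within_second(nanos)
  nanos % NANOS_PER_SECOND
end

# One seventh of a day is not a whole number of nanoseconds, so an
# integer step loses a fraction of a nanosecond every time.
step = NANOS_PER_DAY / 7
after_seven_steps = step * 7
shortfall = NANOS_PER_DAY - after_seven_steps
puts shortfall   # 1
```

After seven steps the total is one nanosecond short of a full day, which is why the last object lands on the final nanosecond of the starting date.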
I think we've all heard the phrase that no code is faster than no code. One of the benefits of home_run's data structure is that you do not need to run conversion algorithms in many cases. Let's take this code, which adds two months to today's date. The standard date library needs to convert the date to a Julian date in the constructor, convert it back to a civil date to add the months, add two months to the civil date to get a new civil date, and convert that back to a Julian date in order to store it. With home_run, the initial storage is done directly with the civil date and the addition of months uses the civil date, so no conversion to or from a Julian date is ever done. I mentioned earlier that most of the actual problematic algorithms in the Ruby standard library are the conversions to and from strings. home_run's approach to these methods is to avoid the use of regular expressions, and it keeps the code simple with an eye towards performance at the expense of some verbosity. So here's code taken from home_run's implementation of the strftime instance method. It's basically a simple string scanner. It loops through each character of the format string. If the character was preceded by a percent sign, the mod flag is set and there is a simple switch statement on the following character. Each modifier character has its own branch in the switch statement, and almost all characters are handled by a construction like this, where sprintf is used to append characters to the return string. For those of you familiar with C, this may look slightly unsafe taken in isolation, but home_run ensures that there are at least 64 characters available in the buffer, and none of these sprintf calls should produce more than 64 characters. The cases where compound modifiers such as capital F are used are handled by specifying all the arguments in the sprintf function call.
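The character-by-character scanner described above can be sketched in Ruby. This is not home_run's C code: the method name and the handful of supported modifiers are assumptions, and `Kernel.format` stands in for the C `sprintf` calls.

```ruby
# Hypothetical formatter that scans the format string one character at a time.
def sketch_strftime(year, month, day, fmt)
  out = String.new
  after_percent = false  # plays the role of the "mod" flag

  fmt.each_char do |ch|
    unless after_percent
      if ch == "%"
        after_percent = true
      else
        out << ch
      end
      next
    end

    after_percent = false
    case ch
    when "Y" then out << Kernel.format("%d", year)
    when "m" then out << Kernel.format("%02d", month)
    when "d" then out << Kernel.format("%02d", day)
    # A compound modifier is expanded inline with all of its arguments
    # instead of calling the formatter recursively.
    when "F" then out << Kernel.format("%d-%02d-%02d", year, month, day)
    when "%" then out << "%"
    else out << "%" << ch  # unknown modifiers pass through unchanged
    end
  end

  out << "%" if after_percent  # a trailing lone percent sign
  out
end

puts sketch_strftime(2010, 11, 2, "%F is %d/%m/%Y")
```

There are no regular expressions and no recursion; each character is visited exactly once.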
Handling compound modifiers this way is faster than using a recursive function call, but it is a little bit more verbose. Now the strptime class method is handled very similarly to the strftime instance method, but it is a bit more complex. Instead of using sprintf to format a string, it uses sscanf to read from the input string and assign values to local variables. If the values are assigned correctly and they are valid, the state variable is updated with the fields that were set. Now some format modifiers that can be used in compound modifiers, such as lowercase m for the month, are a little bit more involved. In order to avoid a ton of redundant code, macros are defined and immediately called. This allows compound modifiers such as capital F, which parses the year, month, and day in ISO 8601 format, to be written as a series of macro calls. The reason for using macro calls instead of calls to another function is that all data is stored in local variables instead of a C structure. Now home_run does actually borrow some code from the Ruby standard library. Because of the complexity of the standard library's parsing code, I determined it would be very difficult and error prone to rewrite it in C and keep full backwards compatibility. So home_run uses a modified version of the standard library's parsing code with two important modifications. The simple modification is just using a plain hash instead of the Date::Format::Bag, which provides an instant 50% speedup without any drawbacks. The second and more interesting part is that before attempting to do any parsing in Ruby, it calls the Ragel parse method. Now this method is implemented in C and uses the Ragel state machine compiler. Hopefully most of you have heard of Ragel. It was made sort of famous in the Ruby community by Zed Shaw, who used it in the Mongrel HTTP parser, which was also used by Thin, Unicorn, and other Ruby web servers. It was also used by why as the scanner for Hpricot.
Anyway, if you haven't heard of Ragel or you just aren't sure what it is, Ragel basically compiles a state machine, sort of like a regular expression, that you specify using the Ragel domain-specific language. It's more powerful in some ways and less powerful in others. It's more powerful in that you can embed arbitrary code at any point in the scanning process. And it's less powerful in that it doesn't support features like backtracking, and if you want to get the equivalent of submatches, you have to implement them yourself using actions. Implementing submatches is the most common action in the date parser. So let's just take this first line. It defines a Ragel machine named CLF year that will parse four digits optionally preceded by a minus sign. Now Ragel supports numerous state machine actions for each machine, and one of the most common actions is the entering action, which is called when the machine is entered. So when the CLF year machine is entered, the tag CLF year action is called. That action just sets a local variable that I have inside the parsing function to the current value of Ragel's pointer, and this marks the beginning of the submatch. Most of the other machine entering actions are similar. Now in some cases you would also want to add a finishing action that would mark the end of the submatch, but that's not needed in most cases in the date parser. Most Ragel machines are built out of other Ragel machines, such as this one named CLF date time, which is a machine for the entire Apache common log format. This machine accepts the CLF date format optionally followed by the colon space. If the input string ends in one of the CLF date time machine's final states, the set parser CLF action will be called, and this action sets the parser's global flag variable to the CLF parser.
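The submatch technique described above, recording the scanner position when a sub-machine is entered and converting the marked text afterwards, can be imitated by hand in Ruby. This sketch is not Ragel output and not home_run's grammar: it only covers a day/month/year prefix of the common log format, and all names in it are hypothetical.

```ruby
MONTH_NAMES = %w[jan feb mar apr may jun jul aug sep oct nov dec].freeze

def digit?(ch)
  !ch.nil? && ch >= "0" && ch <= "9"
end

# Scans input such as "12/Nov/2010" and returns a hash, or nil on failure.
def parse_clf_date(input)
  pos = 0

  day_start = pos                  # "entering action": mark the submatch
  2.times do
    return nil unless digit?(input[pos])
    pos += 1
  end
  return nil unless input[pos] == "/"
  pos += 1

  month_start = pos                # mark the month submatch
  pos += 3
  return nil unless input[pos] == "/"
  pos += 1

  year_start = pos                 # mark the year submatch
  pos += 1 if input[pos] == "-"    # optional minus sign
  4.times do
    return nil unless digit?(input[pos])
    pos += 1
  end
  return nil unless pos == input.length

  # After a full match, convert the marked positions to values.
  month = MONTH_NAMES.index(input[month_start, 3].downcase)
  return nil unless month

  { year: input[year_start...pos].to_i, mon: month + 1, mday: input[day_start, 2].to_i }
end

p parse_clf_date("12/Nov/2010")
```

No end markers are needed here because each field has a fixed width or runs to the end of the input, which mirrors the point that finishing actions are rarely required.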
So after the Ragel parser finishes, there's a switch statement on the parser's flag variable, and if the CLF parser matched fully it will execute this C code. Now most of this C code is just taking the saved pointer positions for all of the submatches in the string, which were set by the machine entering actions, and converting them to long integers. Then it sets flags for the local variables to which it assigned values. Those flags are used later in order to set Ruby values in the return hash. Here the year local variable is converted to a Ruby integer and assigned as the value for the year Ruby symbol key. Anyway, I probably covered that way too quickly, but you could probably have a whole conference just about Ragel. This is just the basics of Ragel's capabilities, but it makes home_run's parser many times faster than the Ruby standard library if you're using one of the formats that the Ragel parser understands, which are currently ISO 8601, RFC 2822, HTTP, and the Apache common log format. So thanks to a better data structure and faster algorithms, what kind of speedup does home_run give you? Well, instantiating Date objects is about 14 to 66 times faster, more so on older versions of Ruby. For DateTime objects, instantiation is 17 to 146 times faster. strftime is 62 to 104 times faster. strptime is 23 to 71 times faster. Parsing is 25 to 56 times faster for formats supported by the Ragel parser and 50% faster for other formats. A calculation such as adding or subtracting is 120 times faster. So with the kind of speedups that home_run gives you, if the standard date library is a slow car then home_run is a rocket. But that's not really fair unless the only thing your app does is handle dates. In my own applications I've seen speed increases of 3 to 4 times for date-based queries just by using home_run.
So if the standard date library is a slow car, then Home Run is the Bugatti Veyron Super Sport, the world's fastest street-legal production car. And Home Run is not just faster than the standard library, it's also more memory efficient. Even though it stores the date and time information in two separate ways, by tightly packing the information in a C structure it's about 2 to 6 times more memory efficient than the Ruby standard library. In addition to being more memory efficient, Home Run also creates 11 to 66 times less garbage, which contributes to its speed.
Now, Home Run was mostly written for performance, and it attempts to have an API that's as close as possible to the Ruby standard library. However, there is one mistake that the standard library makes that I think is so bad it must be corrected, and the mistake is how it parses a date like this. Now, in Ruby 1.8 this is parsed successfully as Christmas. And if you upgrade to Ruby 1.9 you'll find that Ruby has stabbed you in the back, since parsing that date will now raise an ArgumentError. That's because Ruby 1.9 switched the slash-separated date format to assume day, month, year instead of the previous month, day, year, breaking the parsing of the most common American date format. Now, I'm not saying that month, day, year is correct, because a lot of other countries do use day, month, year. But for Ruby to break your code and give you no easy way to fix it is very troubling. Thankfully, Home Run gives you a way to fix this. Using this cryptic code you can once again have correct American date format parsing on 1.9. And while the Ruby 1.8 library will not correctly handle slash-separated dates in day, month, year format, you can use this code with Home Run and get correct date parsing there too.
So at this point I'd like to shift gears again and just talk about a few things that I learned while writing Home Run. As I mentioned earlier, one of the reasons I wrote Home Run was to get more experience writing non-trivial Ruby C extensions.
And one of the reasons for this is that I really didn't know C as well as I wanted to, and that showed in a few places. There were a couple of things that tripped me up. For example, consider the following expression, which is the modulus of a negative integer by a positive integer. If you usually program in Ruby, you're probably used to this returning positive 6. But in C90 the result is implementation dependent, and in C99 the result is negative 4. In Home Run, modulus is used in various places, like determining the day of the week given the Julian day, which should be returned as a non-negative integer. Since negative Julian days are supported, I ended up having to write my own modulus function in C to emulate Ruby's, so that a non-negative number will always be returned even if the dividend is negative.
Another thing that tripped me up was the formatting of 64-bit integers in format strings. I was originally using code like this until I tested it on Windows, where it turns out this is not supported. If you're writing portable code you have to include the inttypes.h header file and use this format string. And this actually looked a little bit odd to me, since it's a literal string followed by what looks like a constant. The reason this works in C is that PRId64 is actually a macro that is resolved to a literal string by the C preprocessor, and C automatically concatenates adjacent string literals.
I remember when I was writing my first few Ruby C extensions I had a lot of problems with memory leaks, and I decided to take an approach to memory management in Home Run that made it so I didn't really have to worry about leaking memory. Now, almost all Ruby programmers know that Ruby has a garbage collector that makes it so you don't really have to worry about memory management.
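Going back to the two C portability points above, here is a minimal sketch of both. It is not Home Run's actual code: `floor_mod` and `format_int64` are hypothetical names, and the operands -4 and 10 are an assumption chosen because they match the results quoted in the talk (positive 6 in Ruby, negative 4 in C99).

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: modulus with Ruby's convention for a positive
 * divisor, so the result is always in the range 0 .. divisor - 1. */
long floor_mod(long dividend, long divisor)
{
    long r;

    assert(divisor > 0);
    r = dividend % divisor;  /* C99: truncates toward zero, so -4 % 10 is -4 */
    if (r < 0)
        r += divisor;        /* shift a negative remainder into range */
    return r;
}

/* Hypothetical helper: formats a 64-bit integer portably and returns the
 * number of characters snprintf reports. PRId64 expands to a string
 * literal, and the compiler joins it with the adjacent "%" literal. */
int format_int64(char *buf, size_t size, int64_t value)
{
    return snprintf(buf, size, "%" PRId64, value);
}
```

With these definitions, `floor_mod(-4, 10)` yields 6, matching Ruby, and the check on the remainder also covers C90 compilers whichever way they round. `format_int64` writes the same digits regardless of whether the platform's 64-bit type is `long` or `long long`.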
There are basically only three ways to leak memory in Ruby: you either keep a reference to an object that you no longer need, you use a C extension that leaks memory, or you find a bug in Ruby itself that leaks memory. Now, leaking memory in C extensions is very easy. Here's a simple example: you take a C string and you want to create a Ruby string from the first three characters. This leaks because strndup calls malloc, which allocates new memory, but the return value is passed to rb_str_new2 and never freed. So basically, if you are a Ruby C extension programmer, malloc is your enemy. And how do you fight such an enemy? Well, you replace your enemy malloc with your friend, the Ruby garbage collector. If you let the Ruby garbage collector do all of your work, and you never call malloc or another C function that calls malloc, your C extension cannot leak memory. Some other C extension writers may think this is inconceivable, but there are techniques that you can use.
So how do you write a C extension that does not leak memory? Basically, it never calls malloc directly. Well, there are two really simple techniques that allow you to do this. The first is just to use local variables allocated on the stack wherever you can. In this example, instead of a pointer to a data structure, the entire structure is used, and the address of that structure is passed to the fill commercial function. Since the structure is allocated on the stack and not on the heap, the memory is automatically freed when the function returns. Now, if you can't allocate on the stack, the easiest way to avoid leaking memory is to allocate a Ruby object, even if that Ruby object is only used temporarily. A simple example of this is in the strftime function. Home Run first allocates the initial Ruby string for the output string using rb_str_buf_new. Now, this function creates a Ruby string with zero length but with a C buffer with at least that number of characters.
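The first technique, keeping temporaries on the stack, can be illustrated with the three-character example from earlier in this part of the talk. This is a plain C sketch with no Ruby headers, and the names `copy_prefix` and `prefix_equals` are hypothetical. Instead of calling strndup, which allocates with malloc, the caller supplies a small stack buffer. In a real extension, Ruby's `rb_str_new(ptr, len)` takes an explicit length, so the intermediate copy may not be needed at all.

```c
#include <assert.h>
#include <string.h>

#define PREFIX_LEN 3

/* Hypothetical helper: copies at most PREFIX_LEN characters of src into
 * out, which must hold at least PREFIX_LEN + 1 bytes, and returns the
 * number of characters copied. Nothing is allocated on the heap. */
size_t copy_prefix(const char *src, char *out)
{
    size_t n;

    assert(src != NULL && out != NULL);
    n = strlen(src);
    if (n > PREFIX_LEN)
        n = PREFIX_LEN;
    memcpy(out, src, n);
    out[n] = '\0';
    return n;
}

/* Hypothetical caller: the temporary lives on the stack, so its memory is
 * released automatically when the function returns. */
int prefix_equals(const char *src, const char *expected)
{
    char prefix[PREFIX_LEN + 1];

    copy_prefix(src, prefix);
    return strcmp(prefix, expected) == 0;
}
```

Because `prefix` is an automatic variable, there is no pointer to lose and no matching free to forget, which is the property the talk is after.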
So then Home Run grabs a pointer to that string's buffer. Every time it needs to expand the length of the string, it creates a new Ruby string twice as long, copies the data from the current string, and then switches the pointer over to the new Ruby string's buffer. The final Ruby string created can be returned to the caller of the method, and all the other temporary strings will be garbage collected the next time the Ruby garbage collector runs.
So I'd like to close out the presentation with a short discussion about how the development of Home Run has affected the greater Ruby ecosystem. Home Run was originally developed with support for Ruby 1.9. During the development of Home Run I found bugs in a couple of corner cases of the date and date time methods, and I submitted patches for them, which were both applied fairly quickly. I then ported Home Run to Rubinius, and because Rubinius did not support all the C API features I was using, I had to submit bugs and patches to the Rubinius project in order to add support for those features, and those were also applied quickly. So Rubinius became the second implementation to support Home Run. Finally, I tried to get Home Run working with JRuby's C extension support, and this was back in late August, before the C extension support had been finished and merged into JRuby's master branch. As with Rubinius, I had to add some C extension methods that were not yet implemented, and my patches for those were applied quickly, and then JRuby became the third implementation to support Home Run. It even caused some changes in DataMapper, which was using Rational without requiring it. That works if you're using the standard library, since the standard date library requires Rational, but it breaks if you use Home Run, since Home Run does not require Rational. The DataMapper developers agreed that this was a problem when I reported the issue to them, and now DataMapper requires Rational explicitly.
So as the sun sets on this presentation, I'd ask you to remember just three things: data structures can be more important than algorithms, malloc is your enemy, and try not to be bitter when your patches are not accepted. And that wraps up my presentation. Thank you very much for the opportunity to present here. I'm happy to answer any questions you have.