Hi, everyone. I'm here to talk about glibc. I've tried to put this together as a sort of call to action for a first-time contributor or a fairly new contributor. But I'm also hoping that it's useful, or at least entertaining, to people who are experienced programmers and already know their stuff.

Who am I? My name's Arjun. I'm an upstream glibc contributor. I also co-maintain glibc in Fedora and Red Hat Enterprise Linux, which, of course, means that I work at Red Hat. This is the last talk, so I want to be really quick about whatever I have to say and leave some extra time for questions and answers. I hope I make it. It's really my first talk, so I'm not sure I'll get the timing right. But I'll try my best to leave as much time as I can for questions and for people who want to leave, maybe to party.

So I'm going to start with an introduction. I talked about seven steps in my title, so I'll quickly jump into the seven steps. Then I'm going to walk through a patch, a recent glibc patch, just to show what goes into writing one. Then I'm going to talk about what you could do to contribute to and help with glibc. And then, of course, questions at the end. And we're off.

So I was first introduced to the C language in high school. I knew a little bit about stdio.h and malloc. And to be honest, it was mostly magical incantations that I wrote in the middle of my program to print and to get input and to allocate some memory. I really believed it was something happening within the compiler and didn't think too much about it. But very soon I realized that strlen, for example, is a function that can be written in C. So it's not really magic. There's something going on there. Eventually I got into this field, and now I know a little bit more about all of this. Well, not all, but some of it. It's a pretty wide topic.
So I know that a lot of it is actually written in C, mostly written in C.

So what is glibc? It's the standard C library. We have all of those functions and a lot in between. For example, one might think that main is the first thing that starts executing when you start a program. But there's quite a bit that goes on before that: loading the program, placing everything in the right places, fixing up function addresses. Only then is main called, and you go on with the execution. So there's a lot more to glibc than just providing the functions. It is the runtime.

Okay, so next, we're going to talk about why you might want to contribute to glibc. I think the first reason, at least for me, is that it's very high impact. There are millions of installations. It is the C library for the majority of non-Android Linux-based operating systems. If you make a change in malloc, for example, that shows up on the critical path, you're looking at trillions of executions of the code that you wrote in a week or something like that. I don't know the exact numbers, but you can think about it this way: every time malloc goes through, your code's in there doing something. That's what makes me very happy about being able to work on this stuff.

The other thing is that it is a fairly actively developed piece of software. It's not some arcane old thing that's never updated. We have over 1,000 commits a year, which means we regularly add bugs that need to be removed. My personal experience with the community is that it has been very welcoming to me, very kind. Mistakes are welcome and accepted and understood. We all commit bugs once in a while and help each other fix them. It's like any other piece of software, really. "Recently" is not really accurate, since I think it's been more than a year already, but we have weekly public video patch review meetings.
So people who have recently submitted a patch can actually show up to that meeting. There's a link somewhere in our wiki. Show up to the meeting and talk about your patch, or say that you want a review, and someone will be assigned to look at it. A lot of the regular attendees are regular contributors. I don't show up very often to that meeting, for example, but once in a while I do, and I try to find or get assigned maybe a beginner's patch, because I do care about this. First of all, it's easy to review a patch by a beginner, and it makes me happy to see that we have another person who made a contribution and who might make more contributions, right? It's got this kind of multiplying effect. So we have all of this, we do look out for patches from new contributors, and we also have a code of conduct that is a work in progress. So yeah, we do care about being welcoming, being kind to everyone, and trying to get as many contributions as possible from people who are willing.

Okay, I said seven steps. It was clickbait, to be honest. I didn't know how many steps it would be. I don't think there's a fixed number of steps, but I did shoehorn the seven steps in here. You can see them. So you do a git checkout, and then there is a bit of an idiosyncrasy: you need to build in a separate directory. You can't build in the same directory as the source tree. I don't know the reasons for it. Something to do with the build system. I don't know most things about glibc, to be honest. So you're building in a separate directory, and you need to make sure at configure time that you provide a prefix, which says where it's going to be installed. These are just a couple of idiosyncrasies of building glibc that don't exist in a lot of similarly packaged applications. So let's say that you're trying to fix a bug. A good place to start is by adding a new test that fails as long as the bug has not been fixed. So that's a potential step three.
You implement the fix. You do some testing. We recently added, and "recent" always covers a bit of a longer time span when it comes to glibc, so maybe a couple of years ago, maybe slightly more, a couple of scripts to help you run a program not on the system's installed C library, but on the one that you just built. So there's testrun.sh for that. And then you need to do a lot of special things to get GDB to pick up the in-tree, freshly built glibc sources. Sorry, not sources at this point, of course, executables. And then run it with a test program. So we have a script to help with that as well. You can use that to check how your test and your fix are working. Then eventually you delete the build directory, you reconfigure, you run make check, and you make sure everything's working fine. And perhaps you submit a patch to libc-alpha@sourceware.org. That is our mailing list. We submit patches there. We discuss patches there, right? Is that the end of the talk? Maybe not.

So, I will now go into the anatomy of a relatively simple patch. I tried to look for something roughly halfway between fixing a typo in a comment and an entirely new feature that changes, you know, dozens of files. Somewhere in the middle of that is this. It's a fairly simple fix that went into glibc recently. My colleague Florian, who is an experienced developer and a prolific committer in the glibc source tree, submitted this a couple of days ago. I reviewed it for him. It's basically a fix to a function called strerror. Apparently this function must not return NULL. We'll go into why. So for some reason strerror was returning NULL in some cases, and it's not allowed to do so. And then Florian goes on to explain that we made a recent change where strerror was implemented in terms of another function, strerror_l.
And that's what caused this regression that needs to be fixed, right?

What are strerror and strerror_l? If you know errno, strerror takes a number, which potentially is an error number, and it returns a string corresponding to that number, describing what that error might mean. Obviously, if you pass zero, you get "Success". And I did not know this, and I really don't know if this was a joke or, I don't know, divine intervention, but strerror(42) returned "No message of desired type". I have not read the book, but 42 is apparently an interesting number, and I thought it was funny that this is the reply you get when you try to find out what 42 means.

So strerror_l is a very similar function that returns the string in a given locale. Sorry, not the current locale: the current locale is the one in which the program is running. So if the program is running in locale X and you want the error message in locale Y for whatever reason, you use this other function. Obviously, you can now see why strerror can be implemented in terms of strerror_l. You just pass the current locale and you get the string back. So someone made that change, which was definitely an improvement, right? We don't want to duplicate code. But it caused a regression, and why was it a regression? strerror is not allowed to return NULL, but strerror_l actually is allowed to return NULL. The details of that are in the POSIX documentation for these functions, right? POSIX tells you what these functions are allowed to do in what circumstances. Basically, POSIX says that, whether successful or not, strerror must return a pointer to a generated message string. But strerror_l only needs to return one upon successful completion. If it fails, it can return NULL.
So the moment we made strerror use strerror_l, we started having the same behavior: it was sometimes returning NULL. That was the bug.

So now we're actually looking at the patch that Florian wrote. It was a test and a small change to the code, right? My idea here is to show that a patch is not so complicated. You can look at this and see that it's not, I don't know, arcane magic, right? So the test includes some usual headers. You see a few headers called support/something. Those are part of the glibc test rig. When you write a test for the test suite, you can use some helper functions to do a lot of things, like implement some checks, do error checking for functions that you're not testing, and so on. So we have some helper functions for that. The patch has a test, and these are the headers that it included.

So first, we're going to look at the test itself. I think I should be using this at this point. Okay, is that visible? Yeah. Okay, so it's a function that tests strerror. What it does is set a variable called fail_malloc, and when you set it to true, every malloc from there on is going to fail, right? Then you call strerror with a weird number, which obviously is not a regular value that you'd get for an error. You get the result for it, and then you stop causing malloc to fail, because you don't want the rest of the test to stop working, right? Only for the call to strerror do you want malloc to fail, because we know that that is the point at which strerror was returning NULL. It was trying to do an allocation, the allocation would fail, and it would just return nothing.
So we got malloc to fail, we got a result, and then we check that this result is the same as the string "Unknown error", which is the default string for any kind of error that you don't really know. So we expect this. TEST_COMPARE_STRING is from the test rig. Like I said, we have the support directory where you have all of these helpers, and this is one of them. It just compares strings, and it logs an error if they're not equal, right? Oh, I forgot I can use this to change slides.

So now we're looking at the test itself. The test rig has this bit where it makes sure a test will not run forever, right? If the test was a main program that never returned, then you'd run make check and it would be stuck running one of these tests forever. So for the tests that we have in glibc, we require you to define a function called do_test and write all of your testing inside it. Then you just include the rest of the test rig, which has a main function, and that will make sure that do_test doesn't run for more than a couple of seconds. So either the test returns, or the test gets timed out, in which case it errors out and we know that this test hangs for some reason. So that's pretty much all of the test. Florian did include another test, for strerror_l, but that's not important. I just want to talk about the one piece. That's what I bolded here, right?

And now we move on to the end of the test, which is the malloc. We said that there could be a malloc that fails, and this is the malloc that was in the patch. So it's a malloc which will take precedence over the malloc in the C library.
If you define this function in your test, it's going to get picked up before glibc's malloc gets picked. What it does is this: if fail_malloc is true, it returns NULL, which means it won't allocate. Otherwise it goes into the glibc .so, picks out the actual malloc, and asks it to do the allocation, right? Because we don't want to re-implement the whole malloc, we just use the one that actually works.

Okay, we finished writing the test. How do you add a test to the glibc source tree? This is a string-based test. strerror is a string function, and we have a directory for that. You just go to the Makefile, and there's going to be this little line called tests, and you add the name of the test there without the .c. It's literally just that. You write this function, you don't write it as main, you write it as do_test, then you add it to the Makefile, and you already have a test that will start to fail. Right, without the fix.

So now, and I really hope this is visible, because it was really, really hard for me to split this among multiple slides, we're looking at the fix itself, right? I'll quickly go through this. So we have strerror_l. Like I said before, we implemented strerror in terms of this function, right? So this is the function where the problem now lies. I know it's allowed to return NULL, but there's nothing wrong if it stops returning NULL as well, right? We can do better than what the standard requires. So we'll now make both of these functions not return NULL in any case. So we have this error number that we get. Ignore that bit, it's not about errno. errno is something else. Let's not think about that right now. So we have this error number from the user, which we need to convert to a string. We go to the error list. Really, let's not care about the details. We go to the list with the number and we get a string corresponding to it.
And if we don't get a string for that number, then we know that it's an unknown error. We don't know what this error is. If we knew what it was, we'd have it in the list from the get-error-list call, right? So that's the piece where we don't know what it is. If we do know what it is, then we just translate it to the locale that is requested and return it, right?

So this is what the code looked like before. Here on the right side is what the code looks like after the fix. Everything else is the same. It's just that the bit where we don't know the error got fixed. So what got fixed there? Say we get an error number that we don't know, let's say 999, right? That's what we had in the test. We're trying to return something like "Unknown error 999", right? We're trying to create a string on the fly that includes the number, so that eventually, when it shows up in the application somewhere, that number is not completely lost. That number is still there. So what we were doing was calling asprintf, which allocates memory and prints whatever you want into that memory. It's basically like printf, but it creates its own buffer and prints into a string, right? So we called asprintf, and whenever asprintf returned -1 because it couldn't allocate memory, we were returning NULL, okay?

And what do we do now? So we tried to return "Unknown error 999" and asprintf failed. Okay, Florian changes this a bit. We were looking for -1 for failure. Now we are looking for greater than zero for success. Why is that? Because asprintf returns the number of bytes that were written. So if it wrote a few bytes, we know that it succeeded. If it wrote zero bytes, we know that something went a bit odd there, right? It didn't write anything. And if it returns -1 or any negative number, then of course that's also an error condition.
So that kind of got reversed, right? Before, we had the error condition first and then the success case. So we changed this around a bit. If we succeed, we set the return string to the one that we got from asprintf. But if we fail, what we do here is simply return "Unknown error" without the number, right? Which is a static string. We don't need to allocate anything for it. It's just "Unknown error". It was part of the binary anyway. It's just going to get returned. So that's the fix, right? And now we look at this and we're sure that we're never going to be returning NULL.

Okay. I do want to pause here and ask whether this was complete gibberish or whether it kind of made sense. Kind of made sense? Are there any, let's say, C beginners here who see this and feel like, okay, that's not too hard, it's not arcane magic? Okay. I think that's what I was hoping for. I was just hoping to show that a glibc patch is not all that, right? So that's the fix. And that is actually the entirety of the patch. I know I formatted it and showed it in a slightly different way. That was Florian's patch, which fixed this bug. And that's what I really wanted to make a point about.

Now I want to talk about what you could do to contribute to glibc. Even if you're super experienced in something else, or super experienced in glibc but bored of doing what you usually do, there are a lot of things that require relatively little knowledge of the internals where you could make a difference. The first one is that you could write new tests and you could improve old ones. For example, I'm also a Fedora contributor. So we have the Fedora glibc package, and then we have some CI behind it which runs a lot of tests. Some of those tests I actually wrote a few years back and didn't upstream them, for whatever reason. Don't hate me.
So we run the CI, and a lot of those tests are not really upstream. The reason for that is that some of them require setup or alter the system's configuration in some way, and those are not the sort of tests you want to include in a test suite, because you'd be messing with the user's system. It probably won't even work: it'll fail because it'll try to modify nsswitch.conf and it can't, right? But since then we actually have containerized tests in glibc. So I'm going to go back here. Instead of adding an entry to this tests line, you add it to a different line called tests-container, which lets you write a containerized test. You can make a little directory containing all the files that you want inside the container, and you can write a test that does a bit of setup that would modify the system, but it won't modify the system, because it'll run inside a container. So you could do some tests like that. You could look at the CI tests for Fedora and then upstream them as containerized tests, for example, right?

You could write documentation. I'll come to that. Actually, I just attended a talk earlier today about how it feels to be a beginner in the open source community, and the speaker there said that we should have good documentation. So I feel bad about saying that writing documentation is a beginner task. But okay, for people who are a bit into GDB, there's this whole thing called pretty printers.
I don't know much about it, but I think it's a Python thing where you can write pretty printers, and we have lots of these opaque glibc data structures. Maybe bits of the glibc heap, maybe there's a way to walk through the heap, maybe some other opaque types. Like a lock: is a lock locked? Because if you look at a pthread_mutex_t and see what its values are, it doesn't say whether it's locked or not, it's just going to have some numbers in it. So you could write a pretty printer that actually just prints, okay, this lock is currently locked or not.

Occasionally, when you're just reading code, you'll notice that the code has changed a bit, so the comment is a lie and nobody caught it at review time. It often feels like fixing that is not of value, but honestly it is, right? There is value in fixing that. Obviously it's not so glamorous, but it helps. We also have a bug tracker where you could, you know, sort by new and maybe confirm that a bug actually happens for you, or try these bugs and see why they happen. All sorts of stuff.

Some relatively more specific ideas. You could optimize the integer-to-string conversion in printf. You could rewrite the base64 decoding and encoding. We also don't have info pages for a lot of pthread functions, or for dlopen and related functions. You could write those, explaining the details of how the glibc implementation handles these things. And we have this mtrace, which is currently a Perl script, but you could convert that to C. So these are concrete things, just to mention what we could use help with. We also have awk and Perl scripts that we picked up at various points in time, and we could standardize on Python. No hate for these, it's just about reducing the number of things that we need in order to build glibc. Yeah, so a lot of things to do.
I promised to leave a lot of time for questions, and I see it's only five minutes. I'm sorry about that. But I hope some of you will stay and keep asking. I'm here to answer anything that I can.

Final links. We have a wiki. It's very out of date, to be honest, but it's still useful. We have the bug tracker, and we have the development mailing list. And then this one is, I would say, really nice: libc-help, for people who don't feel so comfortable posting a patch from the get-go but need help with something. Maybe you want to ask what you could work on. Maybe you want to talk about something you're trying to work through but you're having trouble with. Everything libc-related is sort of on topic there. You could write there and ask questions. You could also write to me. I'm sort of a beginner here myself, to be honest. You can spend years working on this stuff and still feel like you don't know much. But you could write to me, and if I can't answer you, I'll point you to someone who can. So now, questions.

So I'm just going to repeat the question, which is that for a lot of application-level stuff there's a more modern, I guess, repository and a CONTRIBUTING.md, possibly on GitHub, and glibc doesn't have that, and also a lot of people take it for granted that it probably works well enough. I guess that's what you want to say. And so how do we hope to have more contributors? To be honest, this is a question I do not have an answer to. It is quite an old piece of software. It is well established. The truth is that there are, as I said, over 1,000 commits a year. I actually checked. Maybe not every release is 500-plus commits, but the average is over 1,000 a year. So it's happening. I think a lot of the contributors do tend to be full-time employees of software companies in this field. That is quite true. It's a hard problem, really, I must say.
I know it's also just not as glamorous as the kernel. Right? I don't know the solution to this. But you're right, I think it is harder to get contributors into this than into some other stuff. So I guess that's my answer. Sounds good? Okay, so two. You can go first.

Yes, so there are a lot of things you could do. First of all, if you found the bug, obviously you could file a bug report while you're working on the patch. File a bug report, assign it to yourself. You don't have to, but it's good to do that. But literally sending the patch to this mailing list is all you need to do. You could be a drive-by contributor who writes a patch that sort of fixes the problem but has some issues. You could just send the patch here and walk away and never come back if you don't want to. And we might still work on it, write ourselves down as a co-author, and finish the patch. I've seen that happen at least once fairly recently. Somebody fixed a bug in one of the utilities that glibc ships as an executable. I don't know which it was. And I think someone fixed up the patch a bit and then committed it on their behalf. So that's it.

It's a bit old-fashioned, I will admit. I use git send-email. I also remember that my company changed our email provider, and there were a few months in between where I was nervous about whether I'd be able to send patches the same way as before. I was using my private email address to send patches. But git send-email works for me even now, and that's the one I use. But you could attach the patch to an email and just send it here, and it should be fine. We don't have pull requests. So, sorry.

Aha, okay, good question. I will come to you. So, license agreement. I guess you mean copyright assignment. Yes, so you may assign copyright to the Free Software Foundation if you wish to. But it is not required anymore.
You could do a Developer Certificate of Origin. You don't need to assign copyright. So that requirement is gone. Yeah, it is fairly new, yes.

I have the same. So, the question is whether there are any free tools to help understand all of the function calls that happen there, where you don't know what's going on. The answer to that is I actually don't know. I have the same problem myself. I try to avoid looking at a lot of context and just look at this particular patch, at this particular bit of code.

Okay, I'm out of time. I can continue answering questions, but we will do it off this platform.