Hello, my name is Michael and I would like to talk to you about rpkg, which should be the next-generation packaging utility: a tool that serves packagers, but maybe also upstream developers and university students and so on, whenever they want to do any kind of RPM packaging, or generally create RPM packages from software they are working on. I'm the dist-git maintainer, and I also maintain a few packages for Fedora in general, so that's my background. So why did I start to work on this thing in the first place? About three years ago, I became the dist-git maintainer. I basically packaged what was in the Fedora infrastructure repositories regarding dist-git and its deployment, and I made an RPM package out of it. And there were some issues in that repository about adding support for Git LFS or Git Annex: instead of simple HTTP uploading and downloading of sources, there were suggestions to use Git LFS, Git Annex, or something more integrated with Git and more up to date. These issues actually touched many more topics, for example working with unpacked sources: not only dist-git repositories that contain a spec file, patches, and tarballs, but also repositories that have a spec file and, next to it, the sources in their upstream form. They discussed workflows, and they discussed working with submodules. These issues gave me the idea to build a new client tool for dist-git, because a lot of what people were talking about there was not really possible to do server-side; I knew I needed to do it client-side.
So that's why I started developing it. There were, and still are, many dist-git deployments: for CentOS, for Fedora, Red Hat internal, and so on, and I had this idea to build one tool that would work with any of them. That was another factor that motivated me in the beginning. But then I realized there are two ways to make a tool compatible with all possible deployments. One way is to put hacks into the client tool that change its behavior based on which deployment you are working with; the other way is to provide a package that gives every dist-git deployment a unified interface. I went the second way, so there was work on the server itself, and that was basically finished by making an RPM package out of dist-git. The thing that is left is for people to start using it, but that's not my problem. Anyway, all the other points below, like avoiding storing changelogs directly in spec files and so on, are why I started to develop a new tool. This is a problem that comes up a lot on mailing lists: changelogs are long and they often duplicate what is in commit messages, or maybe in some other changelog file in the repository. I wanted to figure out how to prevent this duplication. At the same time, I knew about Tito and rdopkg and so on, which are able to work with unpacked sources, so there was another motivation: I wanted to support that use case too, because I wanted to offer the same as the competition offers. I also wanted to provide automatic versioning based on Git tags and Git commit IDs. And I wanted to avoid any unneeded stuff and keep the solution clean, which means I didn't want to put any extra files into the repository that are not part of the package itself and would be there only to support the use of rpkg, or of the packaging utility in general.
I managed to do that, but you can still put an rpkg.conf into your repository and change some configuration for rpkg when it runs on that particular repository. For example, you can set the path to your own macros, which are then used in the spec file, as I will show you later, or you can change the upload or download URL for dist-git, so that you download sources from somewhere other than the default. But this is all optional; if you don't want anything extra in the repository, you can achieve that state. One more thing I should say: my vision was to create a tool that is used by university students as well as package maintainers. I wanted a tool that is recommended when someone has a university project for some course and wants to make a system-installable package out of it. I wanted rpkg to be the thing recommended to that student, so that people start working with RPM at university and basically keep using it. So yeah, it should be easy to use. Then I tried to figure out how to implement those points, and I found a common denominator: you can actually do three of them in the same uniform way by using spec templates. This is some extra language on top of standard RPM syntax that you can put into your spec files to make certain parts of the spec file dynamic. For example, it will generate the version based on the latest annotated tag you have made in your repository. Actually, if you call git describe, it will return something very similar to what I am returning; this idea of versioning software based on commit hashes and tag names has been around for a very long time.
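The idea can be sketched with plain git commands: an annotated tag plus `git describe` already yields a version string of the kind these macros produce (a minimal illustration in a throwaway repository, not rpkg's actual implementation):

```shell
# Throwaway repo demonstrating tag-based versioning (illustrative only;
# rpkg's real macros do more normalization than this).
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"
git tag -a -m "release 1.0" v1.0
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "a fix on top of the release"

# Output format: <tag>-<commits since tag>-g<abbreviated commit id>
git describe --tags --long    # e.g. v1.0-1-g3f2c1ab
```

A tool only has to split that string to get a version (`1.0`) and a snapshot suffix (`1.g3f2c1ab`) suitable for an RPM Release field.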
Actually, Linus himself implemented this command, so it has been in git for, I don't know, ten years; it has been there for a long time, and it just needs to be put into some nice package that people can use and that is easy to get to anyone. These spec templates I wanted to implement in Bash, because RPM has a shell-expansion macro, %(...), which passes an expression to the shell, evaluates it, and replaces the macro invocation in the spec file with the command's output. But there is a catch. I was thinking about using just this shell expansion that RPM already provides, so that I wouldn't need to do anything new, but there is a slight problem with that approach: if you make an SRPM from such a spec file, the spec file is put verbatim into the SRPM, and when you later rebuild that SRPM, it no longer has the git context around it. It is missing the git metadata. So if you had shell commands there that read git metadata, the rebuild would fail, because the metadata is no longer in the SRPM itself. RPM does not have a good macro for this; if there were a macro that put the dynamically generated spec file into the SRPM itself, I wouldn't need to do any of this. In my opinion it doesn't really matter whether it is implemented in RPM or somewhere else, so I just implemented it in the packaging tool. I spent a lot of time on the Bash IRC channel, and later also on the Git IRC channel, asking about how various things work. The Bash channel especially is not exactly friendly to newbies; you can see one of the first threads I started there. It was about passing named arguments to a shell function. I will show you an example. Basically I wanted something like this: you don't pass arguments to a macro positionally; you name them.
And this is Bash, because I wanted to stay compatible with what RPM is already doing and do something similar to its shell expansion. So I was asking on the IRC channel how to do this, and you can see that it very quickly ended up in a nuclear explosion, basically. It often happened like that. I was told that I should stop trying to make my code look like Python, or like any other language you can imagine. The Bash community is friendly, but you need to learn their ways of expressing friendliness. In the end it was actually possible, in the way I will show you. You can use the declare Bash builtin, which declares new variables and assigns values to them. With declare you can also specify the type of a variable if you want to: for example -A for an associative array, or -x to export the variable. And the funny thing is that you can name the arguments that the function should expect and give them default values. Then, if you use an expression that unpacks the actual arguments the function receives at runtime, and you call the function with something like name=foo, declare first sets the default value and the invocation then overrides it. That's really nice, because in the end it is basically a function header. I was surprised how many things you can do in Bash; you really need to look hard for them, but Bash is not so bad. Many times I was really excited that it can do things I wouldn't have expected it could do. So here it prints ABC, because these are the default values, and here bar baz foo. Okay. So what is the current feature set?
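A minimal sketch of the declare trick described above (the function and variable names here are my own, not rpkg's):

```shell
# Named arguments with defaults, via the Bash `declare` builtin.
render() {
    declare name=pkg version=1.0 release=1   # defaults, local to the function
    (( $# )) && declare "$@"                 # override with name=value arguments
    echo "$name-$version-$release"
}

render                       # -> pkg-1.0-1  (all defaults)
render version=2.3 name=foo  # -> foo-2.3-1  (overrides, order-independent)
```

Because `declare` inside a function creates local variables, the name=value pairs act like keyword arguments: callers can pass any subset, in any order, and the rest keep their defaults.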
I implemented all the stuff from the goals, and I implemented it using spec templates. I also made it parallel-safe, which was kind of fun. I was thinking about scenarios where you, for example, run the evaluation of a spec file, rendering the macros, and at the same time someone is switching branches, adding new commits, or making new tags. You evaluate one macro at some point, and then someone switches the branch, or adds a new tag or a new commit, and the next macro has a different context, so you produce a spec file that is inconsistent. I wasn't happy that this was possible. First I introduced something like remembering what the output of one macro was and then reusing it in the next macro, but that was bullshit, sorry. What I did in the end was to set up the state at the beginning of the evaluation, certain data that I then use the whole time afterwards. That way it is solved; it took me a while to figure out how to do this properly. And let's keep this for the end: an extension of the supported Git command set. I wasn't very thrilled, when I started to use fedpkg, that it offers some Git commands but the set is very incomplete. I don't know if I still have fedpkg installed, because I have a new laptop now, but basically it offers, for example, commit or clone, yet there is no merge or log or other things you commonly use. So I tried to make the set of Git commands the packager can use complete, so that you don't need to fall back to Git unless you need some low-level operation. Sometimes, if you want a really tricky Git feature, for example setting the merge strategy or something like that, you need to go to Git directly, but for the normal things I personally do every day, I can just use rpkg and that's it. So this is the end, thanks for your attention.
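The parallel-safety fix can be illustrated with plain git: resolve the repository state to a fixed commit once, up front, and derive everything from that snapshot afterwards (a sketch of the idea, not rpkg's code):

```shell
# Throwaway repo showing the "pin the state first" idea.
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "state A"

snapshot=$(git rev-parse HEAD)   # pin the state before evaluation starts

# ...meanwhile someone else adds a commit mid-evaluation...
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "state B"

# Every later query goes through the pinned commit, so the result is
# consistent no matter what happened to the branch in between.
git log -1 --format=%s "$snapshot"   # -> state A
```

Each macro then reads from the pinned state rather than from the live HEAD, so a branch switch or a new tag during rendering cannot make the spec file internally inconsistent.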
So now I would like to hear from you what you think, whether it is a good idea, and whether it could possibly be used in Fedora. Yeah, right. So actually I left out this thing: user-defined subcommands. They don't need to be only user-defined, but also distribution-defined. This is the last feature I'm missing right now that I want to do: that you can implement your own workflow as a packager. You can add a connection to a build system or an update system, and you can add your own subcommand that does the build, and then you don't need to do this at the package level. You can define this stuff in the repository itself, in rpkg.conf, the configuration file, but you can also do it in a distribution-provided rpkg.conf placed in /etc, and define there the commands that are specific to the distribution, to a particular build system, and so on. If you look at this list of commands, a build command is not there. It took me some time to decide whether I wanted to include a build command that adds integration with a specific build system, but in the end I decided I want just this bare-bones utility that you can extend easily and add what you need to it. This is just about RPM and Git, putting them together, and providing a way to extend that for your own workflows. By the way, if you want to join the development, here is the Git repo; you can write to me and we can do this together. I would appreciate it, because it's more fun to do it with someone else. So, do you have any other questions? Basically, the idea is to always create a new annotated tag for each new release, where you specify what the changes are. Then, if you use the macros that rpkg offers for generating the release and the version, you get that for free: you don't need to edit the spec file, you just make a new annotated tag, and the rendered spec file will contain the new version and a new changelog entry.
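Putting the pieces together, a templated spec for an unpacked-sources repository looks roughly like this. The macro names below are as I recall them from the rpkg-util documentation; treat the exact set and spelling as something to verify against the project docs:

```spec
Name:     {{{ git_dir_name }}}
Version:  {{{ git_dir_version }}}
Release:  {{{ git_dir_release }}}
Summary:  Example package built from unpacked sources
License:  MIT

VCS:      {{{ git_dir_vcs }}}
Source:   {{{ git_dir_pack }}}

%description
The {{{ ... }}} parts are replaced from git metadata when rpkg renders
this template, so nothing here needs manual bumping per release.

%changelog
{{{ git_dir_changelog }}}
```

The point of the template is that Version, Release, Source, and %changelog are all derived from the repository state, so a new release is just a new annotated tag.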
I read the bodies of annotated tags: when you make a new tag with rpkg, it stores the tag message in the tag, and when I render the spec file, I read the contents and render a valid RPM changelog from them. There are some parameters that tweak the way the changelog is generated: you can, for example, leave out some tag if it contains a mistake, provide that changelog entry manually, and generate the rest of the entries automatically. So you can leave things out and do stuff like that. The changelog macro is the thing that generates the changelog; before I had this macro, I was putting entries into the spec file manually, so this here is manual history. I could also limit this macro to only generate the changelog down to a certain tag and then continue manually from there, or I could just not use the macro at all if I don't want to. That's true, that's true; the name macro is not a very useful thing. This one is, I would say, useful, and there is also a macro for the release, for automatic bumping. And then there is this stuff, which is quite horrible: these two macros, git_dir_archive and git_dir_pack, take the raw content of the repository and make a tarball from it that RPM can then use to build the RPM. git_dir_pack uses the git check-ignore command, which is supported only from about git 1.8, which is not available on EPEL 6 or CentOS 6, so I needed this expression that basically uses git_dir_pack if we are on a newer-than-6 version of the operating system, and falls back to git_dir_archive otherwise. These macros do the packing, so I work all the time with the raw sources, and if I want to make an RPM from them, I call rpkg local; it automatically creates the tarballs at the beginning and then passes them to RPM to continue.
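The tag-message-to-changelog idea is easy to demonstrate with git alone: the release notes live in the annotated tag body, and a tool can read them back out (an illustrative sketch in a throwaway repository, not rpkg's implementation):

```shell
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# The changelog entry for the release is stored in the tag message.
git tag -a v1.0 -m "- Fix the frobnicator
- Update translations"

# Read the tag bodies back, newest first; a tool would format these
# into valid %changelog entries.
git for-each-ref --sort=-creatordate \
    --format='%(refname:short):%0a%(contents)' refs/tags
```

Since the message travels with the tag, the history of releases is reconstructible from git metadata alone, and no changelog text has to be duplicated in the spec file.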
Well, because you cannot merge a tag into your branch: if you have a tag on a side branch, you cannot fast-forward it into your branch; it will always stay on the side branch. So if you want to create a new release, you should do it on the main branch you are developing on, and side branches should probably be used for development only, not for releasing, I guess. I need to look closer into whether there is some possibility, but I think there is a limitation there. Really? That makes sense. Okay. So it probably works. Do you have to have all the changes you want committed and tagged before you do a local build of the final RPM version, or does it take the changes in the working directory into account? Right, yes it does. If you are on your own machine doing stuff, changing something, what can I do? You can see that the version of the package is now reported as dirty, and if I am using the git_dir_pack macro in my spec file, which I am on this system, then the RPM I create from this will contain that dirty content. But git_dir_archive only works on a clean tree; it never works on a dirty tree. So if I want to create a clean RPM that doesn't contain any dirty changes, I can use git_dir_archive and it will do it. If I am developing or testing stuff on my own machine, I can use git_dir_pack, because then I can make changes on the fly and I will see them. Actually, I can use git_dir_archive all the time, because if git_dir_archive finds out it is working with a dirty tree, it falls back to git_dir_pack: if you work on a dirty tree you get a dirty RPM, and if you have a clean tree you get a clean RPM. The difference is, it's not clear, sorry, I am stepping over the time.
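The dirty-versus-clean behavior just described mirrors what `git describe --dirty` does, which makes it easy to see in isolation (a sketch; rpkg encodes the same information in the version and the tarball name):

```shell
tmp=$(mktemp -d) && cd "$tmp"
git init -q
echo "hello" > file.txt
git add file.txt
git -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "initial commit"
git tag -a -m "release 1.0" v1.0

git describe --tags --dirty      # clean tree -> v1.0

echo "local edit" >> file.txt    # uncommitted change makes the tree dirty
git describe --tags --dirty      # dirty tree -> v1.0-dirty
```

The suffix appears only while uncommitted changes exist, so anything built from the tree carries an unambiguous marker of whether it came from committed state.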
So, the last thing: the difference is that if you are working on a clean tree, then in the tarball generated by git_dir_archive you get, embedded in the tarball name, not a hash sum but the commit ID the tarball comes from. So when you inspect the RPM later, you can recognize whether it was generated from a dirty tree or a clean tree by inspecting the tarball name. And you will also see it in the VCS tag: if the tree was dirty, the commit hash will not be in the URL; if it was clean, it will be there. So you don't need to be afraid that you somehow end up mixing dirty RPMs with clean RPMs, because there is a clean differentiation between them; you can look at the VCS tag and you will know whether it was clean or dirty. So that's probably it. Thank you very much for your attention.