Hello everybody. After lunch, some of you may even have heard about containers before, and many of you probably understand what they are. So let's give a hand to Matthias, who will explain how we can deal with them in depth. Thank you.

My talk is called Software Bundling Sucks, and I hope we can have a discussion at the end on why it sucks, and on how we could implement or use it in a way that sucks less. First of all, to set bundling apart from containers: what is bundling, actually? Bundling is an approach to make software runnable, without recompiling, on ideally multiple distributions, by embedding all the dependencies required to run that particular software. Unfortunately, that includes low-level stuff like libc as well as toolkits like GTK or Qt, especially because you cannot know what is present on the target distribution. Unlike containers, it usually doesn't involve including a full distribution, and it's also more tailored to GUI applications and desktop environments. So the current focus of most bundling systems is running stuff on Linux desktops, not running server things, which is what Docker is mostly about.

So how are applications currently distributed? There are a few things that make a Linux distribution not ideal for distributing software, and the main one is that a Linux distribution like Debian is an operating system plus applications. We have things which are clearly part of the operating system, like the Linux kernel, libc, compilers, and even toolkits. But we also have the applications, like Krita, GIMP, or Shotwell. And those two sides have different goals and different ideas about how software distribution should work.
On the one hand, there are the upstream developers, who create some awesome software, make its source code available, and give it to a distribution packager, who then packages it for a distribution. All these different entities in this delivery pipeline have different goals and apply different standards to the software. The upstream project which develops an application wants people to actually use the latest and greatest release, and doesn't care that much about backporting fixes to some older version. There are some projects which have long-term releases, so they can backport fixes to those, but that might not be the release that ends up in the distribution and ends up being supported by a distribution like Debian. They also want people to be able to install their stuff quickly, in order to test whether a certain bug fix is working or to try out new builds, and they do not want to create a deb package or an RPM package and test it on all the different Linux distributions that are out there.

The distribution packager, on the other hand, cares about system integration. If an upstream publishes software, the distribution packager makes it match the policy requirements that a specific distribution has, and also cares a lot about integrating it well into the system. This is what we are all familiar with, I think. The distribution packager also backports stable fixes and maintains the software in the distribution independently from the original upstream. And the distribution itself should ideally be rock solid; users do not want accidental regressions, for example their printer not working because something changed in the kernel. Some people use rolling releases, of course, which don't have this problem.
But rolling releases are mostly something for more experienced users who know how to deal with potential breakage. So they're not really a solution, especially not for enterprise environments, where you really don't want this kind of flux, where you cannot rely on things being stable and not changing, and where you need a stable target to develop for. And the distribution obviously doesn't change often. So you see there's a clear conflict between upstreams wanting to push things to users faster and the distribution basically not allowing that by its current scheme.

So why do people bundle things? Mainly because they want new software which is not in the distribution repositories, and they want to get new releases out. There are also some smaller goals, like quickly testing a bug fix an upstream provided, and the hope that distributors themselves would have less work if they did not need to package the world. Obviously, we cannot package every single application out there that is published as source code. If upstreams could make their software available directly to users, we would not need to spend time creating distribution packages for every single application out there.

Now you might ask: why do upstream projects not use deb or RPM repositories to deliver their stuff? One of the biggest reasons is security, because if you install a deb package, which is basically designed for deploying an operating system, you run stuff as root at install time, which is something you really do not want when you get something from a less trusted source. It might not even be that there is a virus in there or some malicious software; it might just be that the person who created the package simply didn't know how to write proper maintainer scripts and therefore put something in there which might break your system later. Also, these additional packages break distribution upgrades quite often.
If there's some local package installed which the distributor does not know about, it can lead to conflicts when you try a dist-upgrade. And this is something no user wants, and especially not the distributor. Another thing is that deb and RPM repos are, of course, distribution-specific. We see right now that there is a lot of stuff packaged for Ubuntu, and a lot of software vendors advertise: we support Ubuntu, and Ubuntu only. And this is an issue if you run a different Linux distribution. Sometimes it's not even possible to use Ubuntu packages on Debian, and it's even more annoying for users of Arch or Fedora. So if you have something which just runs on every Linux distribution, you can avoid these problems of Ubuntu centrism. Also, as a software vendor, you can target the really huge market of all Linux distributions, while when you just produce a deb package for Debian, you are targeting only the Debian users, which is a much smaller percentage of all Linux users. Especially for commercial vendors, being able to say "we support all Linux distributions" is much nicer, because they have a larger market they can publish their stuff to.

Additionally, deb and RPM repos are quite an overkill if you just want to deliver an app, because the Debian package format, and RPM as well, was designed for creating a distribution and an operating system, and is not specifically tailored to distributing just applications. So they are a bit more complicated than they would need to be. Obviously, one could adjust the tools to make this easier, but you would still have the security issue and the distribution specificity, so in my opinion it's not worth it. These are the problems with the traditional way of creating packages to deliver software, but obviously there are also a lot of problems with the bundling approach. Ironically, security is one of them.
Upstreams would now need to ensure that they are not only fixing bugs and updating their own software, but also updating all bundled components. So in case something has bundled OpenSSL and there's a security update, that particular upstream needs to be aware of it and update OpenSSL, which is currently handled by the distributors, so upstream basically doesn't need to care about things breaking or being bad in other components they use. This is something upstreams would need to take responsibility for, and from experience we all know that this often doesn't really work.

Also, disk space is an issue, because these extra copies of software which you already have on your Linux distribution require a huge amount of disk space. So some kind of deduplication is needed to make this problem less prominent. Also, we as distributors do a lot of QA on software which upstreams do not do, because they don't have the knowledge, the time, or the infrastructure to do it. This would still need to be done by the upstream projects, especially license checks, which we do quite extensively in Debian and annoy upstreams with. This is something upstreams would need to care about a lot by themselves. An example would be the OpenSSL exception in GPL code; so far nothing has happened, but it's a legal issue which upstreams would need to take care of.

Another issue of bundling is system integration. If you have a different version of GTK bundled with your application, which uses different theming APIs or a different way of theming things than the newer version in your main operating system, and the user changes the theme to something less well supported, then you will have this one bundled application looking and behaving completely differently from everything else on your system.
So integration is really a huge issue with bundled applications. There are some ways to work around it, but it's a tricky bit. One thing to keep in mind about bundling is that it will pretty much always happen. No matter whether we as Debian discourage it and say "don't bundle stuff, bring it into Debian", or whether we just ignore it and say bundling basically doesn't exist in our view of the distribution, upstreams will do it, because the advantages are so valuable for them that they will actually want stuff bundled. For example, there are many commercial software vendors, like MATLAB, who create their own bundles and ship them, because they want to target a larger Linux market and want to make installation a bit easier for their customers. Especially in the field of proprietary applications, this is very prominent. But bundling has a lot of advantages even for open source applications, and they are already embracing it. So ignoring it won't help.

Another thing is that bundling solves problems of the Linux ecosystem which we as a single distribution cannot solve on our own. One problem with Linux is that you cannot really rely on anything being present on the system. You cannot rely on systemd APIs being there. You cannot even rely on bash being the default shell, or on bash being available at all. You cannot rely on a certain version of libc or libstdc++ being available, because you might have used a newer compiler which isn't compatible with the old version the distribution ships. The only thing you can rely on is that there is a Linux kernel, but even then you have the problem that not all kernel features might be enabled. So in order to really catch all distributions, you unfortunately need to bundle a lot of stuff to make it work.
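As a tiny illustration of how little can be assumed about the host, here is a sketch of mine (not something from the talk) that probes the C library version at run time. On glibc systems it reports a version; on musl-based distributions the symbol simply doesn't exist, which is exactly the portability problem described above.

```python
import ctypes

# Loading with None (dlopen(NULL)) gives access to the symbols
# already linked into the running process, which includes
# whatever libc this process actually runs on.
libc = ctypes.CDLL(None)

try:
    fn = libc.gnu_get_libc_version   # glibc-only symbol
    fn.restype = ctypes.c_char_p
    version = fn().decode()
except AttributeError:
    version = None  # not glibc (e.g. musl): nothing to rely on

print(version)
```

A bundle that hard-codes assumptions about the answer here will break somewhere, which is why bundlers end up shipping their own libc or depending on a runtime that provides one.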
And a single Linux distribution saying "okay, we standardize on this API and keep the ABI stable for these libraries" won't help, because it's just one distribution, and this is a problem of the whole Linux ecosystem. At the same time, it's also worth saying that this flexibility, this ability to change every single bit of the stack, is one of Linux's biggest strengths. So it's not something people actually want to "solve", because the flexibility of Linux is one of the reasons why it's so successful, why you can use it in so many different areas, and why you can tailor distributions exactly to your needs. So this is an issue which cannot be solved that easily.

So what do the bundling solutions actually look like? In the next slides, I want to go through a few of them. At this time, at least six different bundling systems exist. Actually, that isn't true, because this morning I learned about at least one more. And this doesn't include the container stuff; if you count rkt and Docker, it would be even more. And if you include other deployment methods, like Chef's Omnibus packaging, it would be even more still. So this problem has been solved a lot of times, in many different ways, and in many different grades of awesomeness or crappiness. One thing worth mentioning is that all the bundling systems which exist today differ from each other both in the technology they use and in the philosophy they employ. Some might say "we do not want this particular thing to be possible at all", where others embrace it and say "we want users to be able to do this". So there's not only a technology boundary, but also a policy and philosophy boundary between the bundling systems.

So let's start with the first bundling system, AppImageKit, which is very popular right now.
I think it might actually be the one for which most bundles exist, but since that's hard to count, and Snappy also has a lot of stuff in its store, it's not something you should use to measure the popularity of bundling systems. If you want to see an actual, live AppImage bundle, you can check out Krita, who recently published one. What AppImageKit does is basically this: its bundles are ISO images which have an ELF header at the start. So an AppImage bundle is a normal ISO image and an executable at the same time; you download it from some web page, make it executable, and just run it. This mounts the image somewhere and runs the application inside. These AppImage bundles, of course, have the virtue that you can offer users an executable on your home page which they can easily run on Linux, without any installation step and without needing anything additional. Of course, the flip side is exactly that you can do this: you might not want users to be able to execute such stuff from their home directory, and it's also a bit harder to achieve system integration. Another thing is that it bundles all the runtime data, so you might have a really huge executable to download. AppImageKit is developed by an independent developer, and it doesn't require any integration by the distribution. While all the other software bundling solutions need the distribution to make them available in some form in the distribution repositories, AppImageKit is pretty self-contained; you can just download a bundle on any Linux distribution, without requiring that distribution to actually package AppImageKit. It doesn't have deduplication as of now, and therefore the bundles are relatively huge.
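The ELF-plus-ISO trick described above can be illustrated with a short sketch (mine, not real AppImageKit tooling): the ELF magic sits at offset 0, so the kernel can execute the file, while the ISO 9660 volume descriptor sits at its usual fixed offset, so the very same file can also be mounted as an image. The buffer here is synthetic, just to show the two signatures coexisting.

```python
# Dual-format detection: ELF magic at offset 0, ISO 9660
# signature "CD001" at byte 1 of sector 16 (2048-byte sectors).
ELF_MAGIC = b"\x7fELF"
ISO_MAGIC = b"CD001"
ISO_OFFSET = 16 * 2048 + 1   # 0x8001

def looks_like_appimage(data: bytes) -> bool:
    """True if the blob carries both an ELF header and an
    ISO 9660 volume descriptor, like a classic AppImage."""
    return (data[:4] == ELF_MAGIC
            and data[ISO_OFFSET:ISO_OFFSET + 5] == ISO_MAGIC)

# Build a synthetic stand-in for a real bundle:
blob = bytearray(40 * 1024)
blob[:4] = ELF_MAGIC
blob[ISO_OFFSET:ISO_OFFSET + 5] = ISO_MAGIC
print(looks_like_appimage(bytes(blob)))
```

This is essentially the same check the `file` utility performs; a plain ELF binary fails the ISO test, and a plain ISO fails the ELF test.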
Moving on to Snappy. Snappy has a service running in the background which handles these Snappy bundles, which are basically SquashFS images, also containing all the application data and the runtime data in one big bundle. They are actually bigger than AppImage bundles for some applications. But you have the snap service, which manages them transparently: placing them in the right directories, registering them with the system, taking care of system integration, and also communicating with the snap store. So ideally you do not download a Snappy bundle from the vendor itself; you go to a store, to which the software vendor has submitted the Snappy bundle, and download it from there. So there's a certain level of trust involved in this concept.

Snappy, as you might know, is primarily developed by Canonical and is at this time very Ubuntu-centric. I'm saying this because the tools to create Snappy bundles are not really available on multiple distributions, and building them on Ubuntu is at this time the most supported path. That isn't bad per se, because obviously Canonical develops for Ubuntu and on Ubuntu, and they cared about making it work on Ubuntu first. Right now the Snappy bundling system is also coming to other distributions, including Debian, so people can have it as a really cross-distribution app store. As of now it doesn't share runtimes, but it automatically garbage-collects Snappy bundles when they are not used anymore; still, you have really big bundles. It also has a sandboxing concept: Snappy uses kernel features like cgroups to constrain the application where possible, in order to shield the host system from possibly malicious applications.
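Real Snappy confinement uses kernel facilities like cgroups, AppArmor and seccomp, which need system-level setup. As an unprivileged stand-in for the same idea, "let the kernel stop the app before it misbehaves", here is a sketch of mine (not Snappy code) using POSIX resource limits:

```python
import resource
import subprocess
import sys

def run_confined(code: str, mem_bytes: int) -> int:
    """Run a Python snippet in a child process whose address
    space is capped by the kernel before the code ever runs."""
    def apply_limits():
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
    proc = subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=apply_limits,      # runs in the child, pre-exec
        stderr=subprocess.DEVNULL,    # hide the MemoryError traceback
    )
    return proc.returncode

LIMIT = 2 * 1024**3  # 2 GiB address-space cap

# A well-behaved "app" runs fine under the limit...
ok = run_confined("print('hello')", LIMIT)
# ...while one that tries to grab 8 GiB is refused by the kernel.
greedy = run_confined("x = bytearray(8 * 1024**3)", LIMIT)
print(ok, greedy)
```

The principle is the same as with cgroups: the constraint is enforced by the kernel outside the application, so even a malicious or buggy bundle cannot opt out of it.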
There's another bundling system which has quite a strong sandboxing implementation, and that is Flatpak, which was previously known as xdg-app. Flatpak is the first bundling system in this series which actually splits runtime data from applications. You have your application in one bundle and the runtime in a separate bundle. What software vendors do is this: when they develop their software, they know what dependencies they have, and they pick a runtime which satisfies most of them. If I develop a new GNOME media player, for example, and see that it uses GTK, maybe Vala, maybe something else, I will likely pick the runtime the GNOME project produces as an independent vendor, build against their SDK, which is the corresponding set of development files accompanying a runtime, and then ship the result as depending on this specific runtime published by the GNOME project.

These Flatpak runtimes are essentially operating systems without a kernel. They contain libc and all the stuff you would expect to need in order to build and run the software. They are created with tools from the Yocto project, and usually shipped by a different entity than the application. The application vendor just picks one of the runtimes from the GNOME or KDE project, or any bigger entity which publishes runtimes, and uses it. The advantage of this concept is obviously that the runtime can be updated as long as its ABI doesn't change. So if there is a security issue in OpenSSL, and OpenSSL will very likely be in the runtime, the application doesn't need to care about it, because the vendor who created the runtime will take care of it and update that piece.
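That update model can be reduced to a toy sketch: the application pins only a runtime branch, and any compatible update within that branch (say, an OpenSSL fix) reaches the app without the app itself being touched. The identifiers and version numbers below are made up for illustration.

```python
# Installed runtimes, keyed by (id, branch) -> exact version.
runtimes = {("org.gnome.Platform", "3.20"): "3.20.1"}

# The app bundle records only the branch it was built against,
# never the exact runtime version.
app = {"id": "org.example.MediaPlayer",
       "runtime": ("org.gnome.Platform", "3.20")}

def runtime_version(app):
    """Resolve the exact runtime the app will run on."""
    return runtimes[app["runtime"]]

before = runtime_version(app)
# The runtime vendor ships a security fix within the same branch...
runtimes[("org.gnome.Platform", "3.20")] = "3.20.2"
# ...and the app transparently runs on the fixed runtime.
after = runtime_version(app)
print(before, after)
```

The contract is that everything within a branch stays ABI-compatible; an incompatible change would instead be published as a new branch, which apps opt into by rebuilding.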
Flatpak was initially designed by Alexander Larsson at Red Hat. At the core of Flatpak is a technology called OSTree, which is really cool in its own right and actually deserves its own talk. The main thing you need to know about it is that it's very powerful for deduplication. If some applications ship the same file, or if you have three runtimes or more which ship the same files because they have the same glibc version or whatever, those get deduplicated, and you do not waste that much space. Flatpak also has a very advanced sandboxing concept, which involves changing toolkits so that an application asks the desktop environment to open a file on its behalf; the file descriptor of the newly opened file is then passed into the sandbox, which allows the application running in the sandbox to open only that file. This way you can restrict a Flatpak application's access to your home directory, for example, and have it open only the files which you selected in a file chooser running outside the sandbox. This is really cool, and it's probably the most advanced sandboxing concept for desktop applications that exists so far.

Then on to Limba, which is my project. I'm not actively pursuing it right now, but it's very interesting concept-wise, so I include it here. With Limba, you have the runtime split into different parts, and every vendor of any software publishes its stuff as a Limba bundle. So the OpenSSL people would publish their own Limba bundle containing just OpenSSL, plus an accompanying SDK containing the development headers, and the Qt project would build a bundle which contains the Qt libraries. And then, when you build your application, you set dependencies on these particular runtime components and say: I want Qt, OpenSSL, and whatever else I need to run my application.
Limba would then ensure that these dependencies are satisfied, and that components do not get upgraded to newer versions if that breaks ABI. Obviously, this requires that the independent software vendors do not break ABI without telling Limba that they broke it, and also that they do not sneak in behavior changes. So it's a much more complex concept overall, but it ensures that software gets updated as far as possible without breaking ABI, and it makes it possible to split the load of maintaining the stack, because every upstream would maintain its own bundle.

This doesn't work without a central service acting as some kind of guardian of the system, telling you: there are currently five applications depending on an ancient version of GTK; it might be useful to supply more security updates for it, or to simply drop these applications from the store and ask them to please update their GTK version, because that one is unsupported. So you need something analyzing what depends on what in order to make full use of this concept, which means it necessarily has to be a somewhat more centralized service. An analogy would be, for example, Python's pip and PyPI, which have a very similar concept; you can also think of it as something like a meta-distribution. You need tools to check ABI and API. For an upstream project, it's a bit more difficult to create Limba bundles, because while they could of course stuff everything into one bundle, concept-wise Limba wants you to create these independent bundles and build a modular runtime out of them. So creating Limba bundles is a bit harder, especially because the tools refuse to build a package on any mistake you made. Creating the bundles and making them comply with the policy set for Limba is a bit more annoying, and this is one of the reasons why it didn't gain that much traction.
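Concept-wise, the Limba scheme amounts to dependency resolution keyed on declared ABI levels. Here is a toy model of mine (made-up component names and versions, not actual Limba code) in which an in-ABI update replaces a component, while an ABI bump installs in parallel so that existing applications keep working:

```python
# Available component bundles, keyed by (name, declared ABI level).
components = {
    ("openssl", 1): "1.0.2h",
    ("qt", 5): "5.6.1",
}

def publish(name, abi, version):
    """A vendor publishes an update. Same ABI level: the existing
    component is replaced in place. New ABI level: it is installed
    alongside, and apps keep the ABI they declared against."""
    components[(name, abi)] = version

# An application declares dependencies as (name, abi) pairs only.
app_deps = [("openssl", 1), ("qt", 5)]

publish("openssl", 1, "1.0.2i")   # compatible security fix
publish("qt", 6, "6.0.0")         # ABI break -> parallel component

resolved = {dep: components[dep] for dep in app_deps}
print(resolved)
```

The app transparently receives the OpenSSL fix but stays on the Qt ABI it was built against; this is exactly where the "guardian" service comes in, to notice when an old ABI level stops receiving updates.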
And also, the complexity of the dependency handling is quite high if you think of a lot of projects publishing bundles, and of, for example, C++ compiler ABI changes, which might hit this concept. So right now I still think it's a good project, but I have demoted it to a research project; I want to figure out some of these issues first and see if the concept can go somewhere, and maybe also make use of Flatpak bundles.

One thing to mention here: you might ask why we don't just create one bundling solution which fits all purposes and works for everything. The problem is trade-offs. There are a lot of design choices you have to make when creating a bundling system. Do you want to allow people to install stuff into their home directory, or should everything go into a system directory? Do you want a split runtime, or a bigger runtime provided by an independent vendor? Or do you want to allow only one runtime to exist, and not allow different runtimes to be available at all? Because of these trade-offs, right now I see no way to unify all of these different approaches, so creating one solution to fix everything won't really work.

So what's our role as operating system developers and as Debian developers, with all this new bundling stuff being created and overhyped on the Internet? I think one of the most important things is that we should allow bundling to happen and not blame people for doing it. I've seen quite a lot of people saying: it's so ugly that they bundle this stuff, we have it in the distribution already, so why don't they use that?
But there are a lot of issues with that, as outlined; people really want to solve a problem with bundling and scratch an itch which they have, so we should allow bundling to happen and not reject it that much. Another thing we could do is advise people on best practices for distributing software. For example, in Debian we rebuild everything with hardening flags, which upstream projects might not know about, so we should maybe provide documentation on how to bundle properly, on how to apply the quality standards we have as a distributor to the bundles that upstream developers create. Additionally, one could make QA tools, something like Lintian, which check bundles against our policy as well. Long-term, it might be useful to offer a Debian bundle repository which reflects the values we have in the project, like everything being free software and matching a certain level of quality, where we can say: these are good bundles, this is good software, we trust the upstreams, you can install it on your system and you basically won't run into trouble with it. Also, we should check that the operating system works well for sandboxed applications, but this is more in the area of bug fixing, and maybe also create a trust path from Debian to trusted bundle repositories, which goes with the fifth point on the slide. So, this is basically it. Do you have any questions?

Any questions? Okay.

Hi, thanks. Do you know if the Snappy project has any goals to do the sort of GTK integration that Flatpak is doing, or is one of them just for servers and one just for desktops, or both?

Yeah, that's a frequently asked question. Flatpak is primarily designed for desktop applications.
Right now, I don't see a reason why you couldn't run server stuff with it, but its main use case is really desktop applications, and that's what the developers are working on. Snappy, meanwhile, was also designed for web applications and server stuff, so Snappy is a bit broader in scope; the same goes for Limba. AppImageKit was likewise designed just for shipping desktop applications, which doesn't mean it couldn't also run server stuff.

The Ubuntu Touch apps, which are graphical but on touch interfaces rather than the desktop, are meant to migrate to Snappy technology. It's not done yet, but there are public projects and public roadmaps to have that enabled. I'm not sure how much desktop integration that provides.

Right, they recently published something to make integration work better with GTK and Qt on their stack. I haven't looked into the details of what exactly that thing does yet, because this is a very tricky issue, and it would be awesome if Snappy solved it somehow.

Hi, changing subjects; we actually talked about this in the pub, so perhaps it's more interesting for the rest of the people in the room. There's definitely scope, I think, for distros to take a more active role in all of this.
So, for example, there's a danger if a lot of upstreams, especially the interesting upstreams for our default desktops, start to ship all of their apps in this way. GNOME is getting on board with this stuff, maybe Xfce in the future, and KDE and the others. If upstreams even start to prefer that users get their apps this way, then the consequences for the people who work on that stuff in the distributions are kind of interesting, right? Because right now we are really the most important and the preferred way that people get the software, since these bundling things aren't mature enough yet, and to date there hasn't been a good enough way for people to get the stuff. But if that starts to become the de facto way, then the role of the distribution packagers of end-user stuff becomes a bit diminished. So I'm wondering whether we're happy for that to just go away, or whether there's an interesting way for distros to be involved in this stuff. Because if there's a proliferation of runtimes, in Flatpak terms, if every distro makes its own runtime and makes its apps available on its own runtime, then it's not exactly a great situation to have n runtimes times the number of distros, plus the upstreams' runtimes; it becomes a confusing story for users. So I'm wondering what all of this means for people who work in distros on end-user applications, if upstreams start to think that this is the best way to distribute their stuff. Do you have any initial thoughts there, or not?

Well, not much has changed since our pub discussion, actually. So, I think... well, maybe someone else in the room wants to come in.
I think the distributions need to take an active role, because this is the only way we can shape this future. I think upstream projects, especially KDE and GNOME, will make this way of distributing stuff the default, or at least one of the preferred ways for users to get new stuff. Therefore we need to see what we can do as a distribution to make the best out of it. It does put the people maintaining application packages inside Debian into an awkward situation, unless of course we are thinking about big enterprise environments, which value the additional security support and the very tight integration with the main system that distribution packages give. But for the average user, I think Flatpak might become the default.

We have a few comments from IRC. Ashish says that Sandstorm is maybe another bundling system, with its SPK packages, when combined with default packaging tooling. That was a while ago; he was commenting on the fact that Sandstorm is also a bundling system. And another comment, by M4R, is quite long: "I see large security issues with the app bundle concept. I'm rather sure that most upstreams in practice won't simply be able to take care of security issues in dependencies, maybe due to time or knowledge constraints. If the dependency bundles are provided externally, by GNOME or KDE, there's still the question of how long those will be taken care of security-wise. One of the selling points of app bundles is that they can rely on bleeding-edge stuff at the time of release, but this bleeding-edge stuff often doesn't have a stable API. So who is actually going to take care of keeping the older API versions of the dependencies security-supported?"

Yeah, that's actually the reason why it sucks, and why the talk is named that way. Because, especially with the Flatpak concept, we might end up with one Flatpak runtime per GNOME release, so two a year.
And I don't think GNOME will maintain them all for a long time, so we might end up with some applications on runtimes which no longer receive updates. As for sandboxing: I had a discussion with some people who said, basically, sandboxing will fix it and there will be no security issues because we sandbox the stuff, which is a bold statement. But yes, it's a problem. That's why Limba was designed the way it is, and I think it might not work to defer all that updating work to upstreams. So there needs to be a compromise of some kind, which Flatpak made by splitting out the runtime, so that at least some of the stack is maintained by others who know how to do it properly, and applications bundle the rest. But I really can't say anything to make that user feel better, unfortunately.

Hey, how do you think Limba in particular differs from a standard Debian with lots of different library versions packaged? And is that maybe some sort of compromise we could make, providing better tooling around managing multiple versions of a library and making it easier to have the right version available?

One thing Limba does is that it also has a certain level of sandboxing involved, and it doesn't run scripts at install time, so it doesn't have the security issues that traditional deb packages would have. That's one thing; the other is that the dpkg packaging system isn't really designed for what you propose. There is a distribution called NixOS which does that, which basically allows many different versions of different packages to be installed at the same time. I think morphing Debian into something like that is an interesting idea, but I don't think it would work or get enough support from Debian developers.

Okay, any more questions? If not, thank you very much. Let's thank the speaker.