Okay, welcome everyone, we are here to talk about packaging the free software web for the end user. First I want to say a few things about DebConf 15: I love it. I know it's very hard work to organize, and that's why I will probably change the subject whenever you mention doing one in Brazil. But the conference is amazing, the venue is really nice, the food is nice, and the people who work here are nice as well. Having so many kids and families here is awesome. I'm not a parent yet, so I have no idea how hard it is to actually bring kids here, but it's definitely extra encouragement to have kids and be able to bring them to DebConf. Also, if you hack too much at night you might miss all the morning briefings, actually win a raffle, not be there to take your prize, and therefore lose it. Now back to the topic. The problem we are discussing here is that we want the web to be more distributed, more federated, and less centralized. Today we have a really concerning problem: more and more people just rely on centralized services, which we all know contribute to mass surveillance and other issues we should care about. But on the other hand, server-side applications are complicated. For technical people, setting things up is either too difficult or too boring, so you end up doing lots of work, or boring work, that you shouldn't need to do; that's why in the last few years we had a boom of configuration management tools, so you don't have to repeat yourself every time you want to configure something that lots of people configure all the time. It also means that non-technical people can't do it at all, so they end up depending on centralized services, usually one of the big ones, or they are stuck having to find a provider that will do it for them.
And we actually have quite a bit of the free software web in Debian already. When I searched a few months ago, we had more than 3,500 packages with "web" somewhere in the fields that apt-cache searches, and we have 92 packages that ship files under the Apache configuration directory, and that's excluding Apache itself and Apache modules. I also know of several web applications that don't do that at all: Redmine, which I maintain, doesn't ship anything under the Apache configuration directory by default. So I think we are quite good at packaging, but that raises the question of what packaging actually is. Is just providing the source code enough? Are we doing enough for our users? For instance, we have no standard at the moment for how a web application should be packaged, what goes where, and what are the minimal things a web application has to provide. You also have more complicated problems, like cross-package configuration: some configuration doesn't belong to a specific package, so there is no place to put it. You also have databases, which are complicated, and then DNS, email, and even more stuff than is on the slide, like SSL and other things. To talk a little bit about my history with this topic: it goes back to the last DebConf, when I watched Zack's talk on the role of Debian in the future of software freedom. One of his points was the move toward dumb devices that only access network services, and then we go back to the beginning: we were in the hands of proprietary software providers, and now we are in the hands of proprietary service providers. If people can't control the server side, then they are being controlled.
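As an aside on the numbers above: counts like those can be reproduced with commands along these lines. This is just a sketch, the exact figures depend on the state of the archive, and apt-file needs an up-to-date cache (apt-file update) first:

```
$ apt-cache search web | wc -l
$ apt-file search /etc/apache2/ | cut -d: -f1 | sort -u | grep -cv '^apache2\|^libapache2-'
```

The first line counts packages matching "web" in the fields apt-cache searches; the second counts distinct packages shipping files under the Apache configuration directory, filtering out Apache itself and its modules.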
Then at FOSDEM, at the beginning of this year, there was Lucas's talk on whether distributions are a solved problem with no exciting work left to do on them, and at the same conference there was the talk by Deb Nicholson and Christopher Webber. They work on MediaGoblin, and they were talking about the issues that come with making web applications easily available for users, and it's definitely not an easy thing. Also, there was a thread a few months back on debian-project about this wave of people retiring from Debian, and some people argued that maybe distributions are a solved problem, that packaging is solved. There was a very nice discussion on that topic, and I would say that if you are looking for cool stuff to do, you can read that thread. Some of the items may be taken already, but there are still several nice ideas there. So distributions are most definitely not a solved problem, and we still have a big role to play in the free software ecosystem. I started hacking on a proof-of-concept implementation of making web applications easily available for users back in February. Then I went to the Ruby sprint in Paris and was able to attend the MiniDebConf in Lyon in April, where I gave a presentation with very half-baked ideas of how this was going to work. I didn't have the courage to do a live demo there yet, but I got very useful feedback, and this helped me improve the concept and the design of the application. Then I got a GSoC slot for this year and got more submissions than I wanted, some 10 students. Having to rank students is a horrible thing to do, and I had to do it, so I plan to change my strategy for Summer of Code submissions next year. Tiago was the student I selected.
He's from a software engineering program at a university in Brazil, and he did very nice work, on a project that was completely in flux: I was changing things all the time and telling him to redo things he had already done, and he never lost his cool. Then came DebConf 15: lots of hacking. The quiet hacklabs are awesome for that, so I was locked in the quiet hacklab for a few days. I had to stop in the middle to remove the embedded copy of jQuery I had in my code, and in practice I think I have adopted the jQuery package now; I don't know yet how crazy it is to do that, but let's see what happens. Then I went back to hacking, and today you will actually see a live demo. It might break, it might not break, but let's see. The goal of the project is to automate the configuration of web applications that are already packaged in Debian. We are building on all the work that Debian maintainers have already done: there are almost 100 applications, and we don't want to waste that, so we are just making sure that that work gets to the end users without large effort on their part. An alternative version of the goal is to allow everyone to have their own Debian server in a secure and maintainable way, in a way that's repeatable and where upgrades can be handled in a sensible manner, as we usually are able to do in Debian. So it's another variant of world domination. The application I'm working on is called Shak. It stands for "self-hosting application kit", but I will accept other meanings that are cool enough. I have a repository on GitLab, and if you don't want to go to GitLab you can also check it out from git.debian.org under my user account. Of course there are other people looking at this as well: there are Sandstorm, YunoHost, and Bitnami.
They all have nice ideas, and I look at them and plan to try to implement some of them, but they all reinvent packaging in some way or another, and that's not what I think we should be doing at this point. I mean, if they succeed, that's awesome, but I think in Debian we can do better. About the design: of course we are using official Debian packages, with a configuration management layer on top. The idea is that this configuration management does the minimal amount of work necessary to have a great out-of-the-box experience, and also helps us figure out which changes we have to push into the packages so that they work even better for people who are not using this tool. I'm going to talk about this a little more later. The idea is to have a new abstraction, which I'm calling an application, one layer above the packages. An application can be backed by just one package or by several. If you want an email server, for instance, you want an MTA, an IMAP server, and whatnot, but you usually don't think of those as separate things; you just want an email server that works. So the email application in Shak just makes sure all of those are configured to work together in a reasonable way. I was wondering whether it made sense to have applications that require zero packages, but even for a static website you need SSH or similar to copy the files to the server, so it will usually be one or more packages. I'm working to have the business logic completely decoupled from the UI, so you can have multiple UIs. We already have a command-line interface for those who like the command line, which is also useful for automated testing: you can call the command-line interface and check that things happen in the right way. And for humans to use it, it's a requirement to have a nice graphical user interface. Then we have some assumptions about packages: they should do most of their stuff right.
Not automating the web server configuration is more or less okay, because if you want to make things reproducible you want a uniform way of configuring every application anyway, so packages not shipping web server configuration is probably fine. But applications should handle their own upgrades: there's no better place to handle a package upgrade than the package itself. dbconfig-common helps a lot with this. It's very nice that Paul Gevers actually took over dbconfig-common lately and fixed a lot of bugs; it had been abandoned for a few years, I think. We should also thank Sean Finney for doing the initial work and maintaining it for several years. So dbconfig-common is now maintained, and if you're not using it, you should. Then there are a few nice-to-haves. Ideally you want packages to support multiple instances, that is, running multiple independent sites from the same code base, and we will patch things for that. Tiago actually wrote a patch for ownCloud, based on some upstream discussions, which is going to be present in the next upstream version, so you can now have two ownCloud instances on the same server with the same code base. What upstream used to say was: if you want two ownCloud servers, you copy the code twice and go from there. Of course we are not doing that. It would also be nice if applications were not crap. As for the data model: we have these applications, which are the things users see as useful things to have; users don't think in terms of packages. An application is implemented by a cookbook; I'm using the Chef terminology because that's what we are using. The idea is that the cookbook is the code that makes the application be installed and configured in the correct way, and the application is data, the user's data: "I want this website here, using WordPress, with this domain name." Of course there is the code to handle that, but that is application data.
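To make the cookbook side concrete, a trivial recipe could look something like the sketch below. This is the Chef DSL, but the resource names, attribute keys, and file paths are illustrative guesses, not Shak's actual code:

```ruby
# Install the web application from the official Debian package.
package "wordpress"

# Write instance-specific configuration from the user-provided
# application data (here surfaced as a node attribute).
template "/etc/wordpress/config-#{node["wordpress"]["hostname"]}.php" do
  source "config.php.erb"
  variables hostname: node["wordpress"]["hostname"]
end

# Configure the web server for it, and reload when the vhost changes.
template "/etc/apache2/sites-available/wordpress.conf" do
  source "wordpress-vhost.conf.erb"
  notifies :reload, "service[apache2]"
end

service "apache2" do
  action [:enable, :start]
end
```

Because the packages already do the heavy lifting, the recipe reduces to "install the package, write the configuration, wire up the web server".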
And they are completely separate, so you can set up backup schemes to make sure you don't lose the data. You can then specify inputs in the cookbook: what are the parameters the user can provide here? The hostname, the path. There are other examples in other applications; for a static website, for instance, there is which user should be able to write to those files. The cookbook can specify that in a way that requires no change at all in the code base to display those fields to the user and have the user enter the data in the user interface. My idea is that we will be able to add support for new applications just by writing new Chef cookbooks, and if you are using packages, the cookbooks are mostly trivial: please install the package, please create this configuration file with these contents, please configure the web server this way, and reload the web server when you are done. And then we have a working application. Again, the idea is that the Chef recipes must be as simple as possible, and everything that's application-specific should be pushed into the application package. As for the architecture: there is a repository of data which stores the user data about applications. You can use both a command-line interface and a web interface. The command-line interface writes to that repository and then invokes the configuration management layer to do its thing to the system. For the web interface, we have privilege separation: the actual web interface writes to the repository, and a daemon listens for changes to the repository and applies them locally. You can specify access control policies there, so you can let some users create stuff without having their changes applied automatically, and then an administrator can come later, review the changes, and accept that configuration to be applied to the server.
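The repository-of-data idea can be sketched in plain Ruby. Everything here (the AppRepository name, the one-YAML-file-per-application layout) is hypothetical, just to illustrate how a CLI and a web UI can share the same storage while leaving applying the configuration to a separate step:

```ruby
require "yaml"
require "fileutils"

# Hypothetical sketch of the "applications are data" design: each
# application instance is a YAML file in a repository directory, and
# any UI reads and writes the same files before the configuration
# management layer is asked to converge the system.
class AppRepository
  def initialize(dir)
    @dir = dir
    FileUtils.mkdir_p(@dir)
  end

  # Store user-provided data for one application instance.
  def save(name, data)
    File.write(File.join(@dir, "#{name}.yaml"), YAML.dump(data))
  end

  # Load it back, e.g. for display or before applying configuration.
  def load(name)
    YAML.safe_load(File.read(File.join(@dir, "#{name}.yaml")))
  end

  # List known application instances.
  def applications
    Dir.glob(File.join(@dir, "*.yaml"))
       .map { |f| File.basename(f, ".yaml") }
       .sort
  end
end
```

Keeping the store this dumb is what makes the review-then-apply policy possible: a user's write only touches the repository, and a separately privileged daemon decides when to apply it.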
So the code is written in Ruby. There's no Rails, so dependency management is way easier. We are using Chef for configuration management. I'm not really happy with it, because it takes five seconds to do nothing: if you just reapply your current configuration, it takes a few seconds to figure out that there's nothing to do. So I might consider changing to something else. I also try to automate most of the tests that I can. The current state of the project is alpha. The base system is implemented and working, so you can install and update applications, and we have both a command-line and a web interface. Removing applications doesn't quite work yet, so if you are testing this, please make sure you do it in a throwaway machine, a VM or a container or whatever, so that you can just throw it away later; we'll be able to remove applications eventually, we are just not there yet. We have a few applications already available. Basically: static site, which is the one I started with, to make sure I got most of the things right without having to worry about databases and all that stuff. It already works: if you create a static site there, it gives you a directory you can rsync the files to, and it's live. Then Tiago added support for WordPress and ownCloud. He also wrote the email application. Of course, the email one needs to be looked into, because, as you all know, running an email service is not easy, so we will need help to configure DKIM, anti-spam, and all the other kinds of stuff. We are also working to have encryption by default, so everything is encrypted out of the box, also done by Tiago. We are using self-signed certificates for now, and we are looking for viable solutions to automate getting proper certificates, either from the CA cartel or from any other solution you can find.
I mean, I don't think we have an easy solution outside the CA cartel in the short term, so you'll probably have to go with Let's Encrypt anyway. Now, the dangerous part: the demo. Let's see. So we have here the web interface; I don't have anything installed yet. Can you all see that? You won't be able to read the text, but trust me, it's all sensible stuff written there. It lists here the applications that are available. We should have icons and all kinds of eye candy there, but we're just not there yet. You can install, let's go for the easiest one, a static website. You tell it here the hostname and which user should be able to write to the static website. Then it applies the configuration in the background, and: configuration applied. You can just open it here, accept the self-signed certificate, and you have your static website, ready to push to. You can copy this rsync command line here, and then you have a website online. And the awesome thing is that the easiest thing to put online is actually the most performant website you can have, with static files. We can look at the command line also. Let me go back here and just log into the VM. I guess a white background is better. I don't have enough columns to show the table, but it automatically invokes your pager if you don't have enough columns. So here you have the status of each application: a status field that says whether the configuration has already been applied to the server or not, and the link, so you can go from here and click to access it. You can also install stuff from here, from the command line; it uses the very same code that is used everywhere else. Now you have two applications, both up to date, and you can click here and go to the... wow, there's a typo over there. I can also demo updating an application: I can just edit the domain name. Localhost, right.
Then apply again; yeah, three seconds, same as yesterday. Then you can list the applications here, and then you have... yay, the demo doesn't work. Maybe. I mean, I don't know, we don't know yet. It redirects everything to HTTPS, but yeah, I don't know what's going on there. So you see there are a few rough edges to polish. That's just the default HTML page; it somehow came from Apache, but it's actually running on nginx. OK, I think the problem is that we are reusing IDs: I deleted all the test data I had yesterday, so, well. But at least you can trust me that it works, based on the initial demo here. What you did there is also here; it's all the same code. And you'll be able to use this from small screens, so if you're using a tablet or a cell phone, you can also do your thing there. Yeah, I think that's it. It's just in flux, and we have to fix these things; I think the problem is that we are reusing IDs, and we should not be doing that. Now, talking about the future; we have a reasonable amount of time. Next steps: obviously, we have to upload this to Debian. I have the packaging mostly done; I just need to figure out a better way of not having the web interface running at all times, using socket activation and that kind of stuff to start the service, and also a secure way of letting users do the first access, with an SSH tunnel or something, I'm not sure yet. We also need to improve the web applications policy. Neil, I think just after he became DPL, or just before, I don't know, added me to the Alioth project to work on that, but I haven't had any time to do that yet. Tiago's graduation project is going to be related to this: we are trying to figure out what things we should mandate applications to have, and what the common patterns are when you want to configure applications in a consistent way, and I think that's going to help with that.
And hopefully, I think that's quite a nice graduation project: if you can write the web applications policy, I would give you a degree for that. Also, integrate more packages. We only have four: ownCloud, WordPress, static site, and email. I think we have Redmine mostly working, but there are patches pending for Passenger and nginx, and that's it. Also, we need to figure out a way of easily bootstrapping. It will be mostly installing the package, but then again, for non-technical users it's not so easy to just log in with SSH and install a package. Also, providing pre-built images: if we have an image with Shak inside, you'll be able to do everything that you need, and then we have to figure out how to get that to people. Other ideas include being able to spawn new servers in the cloud in one click, in a reasonably secure way. And, maybe even more important than the cloud, having pre-built images for common low-cost hardware. So we'll definitely be talking to the FreedomBox people; I just didn't do that before this week because this project was pretty much vaporware, and I didn't want to ask people to consider vaporware. Now that it's proper software, we can start to talk. So, how you can help. I hope you all want to help; you can see that there's lots of work, and those items I mentioned are not easy at all. We do need collaborators, because I also do a few other things and I'm probably not able to spend all my spare time on this. First, you can request your package to be added: if you maintain a web application that you want added, please talk to us. You can open an issue on GitLab, and as soon as the package gets into the archive, you can also use the BTS; we don't care, we'll be looking at both places. There are also other ways of helping. Usability testing is going to be very important if we want people with no technical skills to be able to use this; we will need lots of that. Bug reports and documentation as well.
I don't know how much of the difficult stuff, like DNS, SSL, and those other things, we are going to be able to abstract away from users, so we will need to explain it to the users; documentation is going to be a really big thing here. We will also have translations: I'm writing the code with gettext support, so we have PO files for you to translate. Of course I'll take code and patches, and I'd love to have code reviews and ideas on how to handle the difficult stuff: DNS, SSL, email. We have the infrastructure working, we just need help with the best way to solve the difficult parts, like anti-spam, checking signatures, and making sure that emails don't end up in the spam boxes of the people they are intended for, and all that kind of thing. And security audits are going to be very important: we want the configuration management code to set up the system in a way that's safe for users, that eases upgrades, and that makes sure people can benefit from the incredible security support we already have in Debian. I didn't ask anyone about this yet, but I think we can use the resources that are already there and mostly unused. The debian-webapps IRC channel had three people the last time I checked, so if you want to discuss this, I encourage you to join there. The debian-webapps list on lists.debian.org is also mostly dead, so I think we can go there too; if people start getting annoyed by us, we can just find another place, but I think we should reuse what's already there. We are six now. OK, we are six now; that's better. Now I'm happy to take any questions or suggestions you might have. So, hi. First of all, thanks for taking the lead on this work, which, as I think you know, I consider very important for the future of Debian and for user freedom in general. Then I have a question.
I think part of the mess we have around packaging web applications, and more generally making it easier for users to install them, is not only our fault; there is also some lack of diligence in upstreams. Debian has in the past given guidelines to our upstreams on how to make software easy to package and, in general, how to better handle all of this dependency management and that kind of stuff. For instance, we have wiki.debian.org/UpstreamGuide, which is full of very useful suggestions for our upstreams. So I was wondering whether, as part of this work, you were also considering proposing guidelines for upstreams of web applications, to actually make the situation better not only for us but for everyone else. Yeah, I think that's a very nice thing to have. There are a few issues that are specific to web applications, like people adding tons and tons of dependencies and then requiring almost 250 packages to have GitLab in the archive, which we are actually quite close to having, but it took something like one or two years of packaging dependencies. But yeah, I think that's a good thing to do, and I'll take note of it here. Go ahead. Hi. First of all, I never thought I would say this, but I think PHP is less crazy than Ruby, than Ruby on Rails packaging. So the question is: I still consider you our Debian Ruby god, and I love gem2deb. Do you plan to tie these two works together somehow, so we can get something like GitLab packaged in a semi-automated way, by pulling the gems and running gem2deb on top of that, something like that? Has that ever crossed your mind, or is this completely crazy? OK. On PHP: whether Ruby is crazier than PHP depends on your definition of crazy. And on GitLab: we have most of the dependencies packaged now; I think we are missing five to ten packages.
And Praveen, who is working a lot on this, actually got the GitLab company to fund his work for two months on that, so I hope to see a proper GitLab package very soon; I think that's going to happen. It did cross my mind to do a Debian package with all the dependencies inside, installed by some crazy means, but I don't think at this point it's useful to do that anymore. I did that for work with an RPM package: I have one big package which is all the GitLab dependencies in one RPM, and then another package that's just GitLab and depends on it, using Bundler and all that stuff. It mostly works, but of course you get one package with I don't know how many libraries inside, and you don't want that in the proper archive. But I think we will have a proper GitLab package very soon, and hopefully we'll be able to backport it to Jessie as well. An additional question: how do you deal with the strict version requirements, usually inside the Gemfile, and the frequent API breakage between even minor Ruby package versions? I'm sure you remember Rack, which completely broke between 1.4 and 1.5, I think, or something like that. This is still a problem, right? That doesn't happen as often as it did in the past. The problem with Rack back then was that Rails actually used some things from Rack that it was not supposed to use, so when the new Rack got into the archive, Rails broke. But today the problem is not so big. What I do in Redmine is just patch out the strict requirements: usually the requirements that upstreams specify in their metadata are more strict than necessary, and upstream library maintainers are not so crazy anymore these days, so it mostly works. Of course there are corner cases, but in Redmine I relax lots of dependencies and it mostly works. Related question.
Have you thought about the other way around: would it be easy to make Debian packages with your system? Is it just a matter of running the cookbook in the postinst to get a Debian package? Because I'd like to have all the sources from Debian, where they're checked and maybe reproducible soon, and everything. I don't think we should do that. I think the Debian packages should be good enough to be used on their own, if you are a system administrator and you know what you're doing. Because if we start doing packages using stuff that's not the standard packaging tools Debian has, then we are creating non-Debian, and we don't want to create non-Debian; we want to put the existing Debian to good use. Just a note: there is such a thing as a Debian web apps policy. It's a bit dated, but there is one. There is one, but it's out of date and nobody follows it, so the point is making sure people know about it, that people follow it, and that we file bugs if they don't. This is maybe slightly along the lines of the last question. As I understand it, you have the metadata and this Chef cookbook stuff, and that's all in the Shak Git repository. So effectively the metadata and the glue for every one of these web apps is defined there, and there's then a dependency on particular versions of things; presumably when you apply it, it automatically runs apt or something to install the package. I can see why that is a natural design. Have you considered the alternative, which is to allow packages, or perhaps metapackages, to provide the metadata and the glue themselves, and whether that would be better or worse from a who-maintains-it point of view and from a practical point of view? Yeah, that's possible. As I said, the cookbooks are completely decoupled from the code, so if you just install the cookbooks in the right place, the tool will see them and you'll be able to use them. Right.
But of course, then you need to install the package that has the cookbook in it first, whereas at the moment you have your Shak package, which contains all the cookbooks. So the Shak thing itself, with the cookbooks, has Debian packaging, right? Right. OK, good. Yeah, it will be in the archive and it will be maintained like any other package. And the idea is that when we get a stable release, we won't need to change the metadata, and the applications will be able to just upgrade themselves in a reasonable way. Then there is the concern of what happens if an application upgrades and the configuration management code that handles it needs to change. The idea is that in unstable that's going to change anyway, and we are going to fix it; but once there is a stable release, we will try to make sure that the configuration management keeps working for the versions in stable, and we can maintain that. Right, and there will be some kind of upgrade path for existing users. Excellent; well, keep up the good work. Thank you. I think there should be time for one more question. Do you have any plans for interactions between applications? For example, thinking of a client for contacts or a client for a calendar that might rely on another service that provides the basic calendar infrastructure: will there be dependencies between those Shak applications, and how would one configure that so that one application just finds the other? OK. Yeah, so you can do that in Chef, and I would assume you can do it in other configuration management tools as well. Also, you can have one application that's actually several packages, so you don't need a one-to-one mapping: your calendaring thing can depend on everything that it needs. Yes, but that service might need to be used by other applications as well. Yeah, that can be done; we are just not there yet, because there's no actual need so far.
But as soon as the need comes up, we can think of a way of doing it. Thank you. Oh, I forgot one thing: I also wanted to thank you for this work, and if you need something related to DNS, just ping me, I might have an idea or two. Thank you.