So, I'm Nicolas Dandrimont, and I'm going to talk to you about a year of fedmsg in Debian.

We have a problem with our infrastructure in distributions. Our services are a bit like people: there are dozens of them, maintained by many different people, and each and every service has its own way of communicating with the rest of the world. If you want to spin up a new service that needs to talk to other services in the distribution, which is basically any service you would want to include, you will need to implement a bunch of communication systems.

For instance, in the Debian infrastructure, our archive software, dak, mostly uses emails and databases to communicate. The metadata is available in an RFC 822 format with no real API, and the database isn't public either. The build queue management software, wanna-build, polls a database every so often to know what needs to be built; there is no API outside of its database, which isn't public either. Our bug tracking system, debbugs, works via email, stores its data in flat files for now, and exposes a read-only SOAP API. Pushes to the source control repositories provided on Alioth can trigger an IRC bot or some emails, but there is no real central notification mechanism.

So we have some crutches available to try to overcome this issue. We have the Ultimate Debian Database, which contains a snapshot of a lot of the databases underlying the Debian infrastructure. Every so often, a cron job runs and imports data from a service here, a service there, so there is no real-time data. It's useful for distribution-wide QA work, where you don't need real-time data, but when you want a notification to trigger a build of a newly uploaded package or something like that, it doesn't work very well. And consistency between the different data sources is not guaranteed.

We have another central notification system, the Package Tracking System, which is also cron-triggered or email-triggered: the data in the PTS can be updated in those two ways, and you can subscribe to email updates on a given package. But the messages are not uniform. They can be machine-parsed, and they carry a few headers, but those aren't really sufficient to know what a message is about. And it's still not real-time.

So the Fedora people invented something that could improve things, called fedmsg. It was introduced in 2009. It's a unified message bus that reduces the coupling between the different services in a distribution. The idea is that services can subscribe to one or several message topics, register callbacks, and react to events triggered by all the services in the distribution; a minimal sketch of what such a consumer looks like comes at the end of this section.

A bunch of things are already implemented on top of fedmsg. You get a stream of data with all the activity in your infrastructure, which allows you to compute statistics, for instance. You decouple your interdependent services, because you can swap one component for another, or just listen to the messages and start doing things directly without having to fiddle with someone else's database. And you can get a pluggable, unified notification system that gathers all the events in the project and sends them by email, by IRC, to your mobile phone, to your desktop, everywhere you want.
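To make the callback idea concrete, here is a minimal sketch of a fedmsg consumer, following the general shape documented upstream (a subclass of FedmsgConsumer with a topic, a config_key, and a consume() method, run by the fedmsg-hub daemon). The Debian-flavored topic string and the config key name are my own illustrative assumptions, not the names of a real deployed consumer.

```python
# Minimal fedmsg consumer sketch: subscribe to a topic and react to events.
# The topic and config_key values are illustrative assumptions.
import fedmsg.consumers


class UploadConsumer(fedmsg.consumers.FedmsgConsumer):
    # Receive every message published under this (hypothetical) topic.
    topic = 'org.debian.dev.mentors.upload'
    # The hub only enables this consumer if this key is True in its config.
    config_key = 'uploadconsumer.enabled'

    def consume(self, message):
        # 'message' is a dict carrying the topic and the message body;
        # a real consumer would react to the event instead of logging it.
        self.log.info("Seen event on %r", message['topic'])
```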
The Fedora people use fedmsg to implement a badge system, which is a kind of gamification of the development process of the distribution. They implemented a live web dashboard, they implemented IRC feeds, and they even got some bots banned from social networks for flooding.

So how does it work? The first idea was to use AMQP, as implemented by Qpid. Basically, you take all your services and have them send their messages to a central broker, and then you have several listeners that can send messages to clients. There are a few issues with this: you have a single point of failure at the central broker, and the brokers weren't really reliable. When they tested it under load, the brokers were tipping over. So it wasn't really nice.

The actual implementation of fedmsg uses ZeroMQ. Instead of a single broker, you get a mesh of interconnected services, and you connect only to the services that you want to listen to. The big drawback of this is that each and every service has to open up a port on the public internet for people to be able to connect to it. There are some solutions for that, which we'll talk about. But the main advantage is that you have no central broker, and they got something like a hundredfold speedup over their previous implementation.

You also have an issue with service discovery. You can write a broker, which gives you back your single point of failure. You can use DNS, which means that anyone can say, "hey, I added a new service, use this SRV record to get to it." Or you can distribute a text file. Last year, during the Google Summer of Code, I mentored Simon Chopin, who implemented the DNS solution for the integration of fedmsg in Debian. The Fedora people, since they control their whole infrastructure, just distribute a text file with the list of servers that send fedmsg messages.

So how do you use it? This is the Fedora topology; I didn't have much time to draw the Debian one, which is much simpler, and I'll talk about it later. Messages are split into topics, arranged in a hierarchy, so it's really easy to filter out the things you want to listen to. For instance, you can filter all the messages that concern package uploads by using the dak topics, or everything that involves a given package, or something else.

Publishing messages is really trivial. From Python, you only have to import the module and call fedmsg.publish with a dict of the data you want to send, and that's it: your message is published. From the shell, it's really easy too: there's a command called fedmsg-logger that you can pipe some input to, and it goes on the bus.

Receiving messages is trivial, too. In Python, you load the configuration, and then you just have an iterator over incoming messages. There is also a replay mechanism: messages carry a sequence number, and your client can query the event sender for messages it missed, in case of a network failure or anything like that. So that's basically how the system works; the sketch below shows both sides.
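Here is roughly what both sides look like in Python, following the upstream documentation. The topic, module name, and payload are made up for illustration, and the config tweaks on the receiving side are the ones the fedmsg docs suggest for a listen-only client.

```python
import fedmsg
import fedmsg.config

# Publishing: a dict goes out on the bus under a topic.
fedmsg.publish(
    topic='testing',    # illustrative topic
    modname='demo',     # service name, used to build the full topic
    msg={'package': 'hello', 'version': '2.9-1'},  # made-up payload
)

# Receiving: load the config, then iterate over messages as they arrive.
config = fedmsg.config.load_config([], None)
config['mute'] = True     # listen only, never publish
config['timeout'] = 0     # block forever rather than timing out
for name, endpoint, topic, msg in fedmsg.tail_messages(**config):
    print(topic, msg)
```

The shell equivalent of the publishing side is a one-liner along the lines of: echo "Hello DebConf" | fedmsg-logger --modname=demo --topic=testing.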
Now, what about fedmsg in Debian? During last year's Google Summer of Code, a lot happened thanks to Simon Chopin's work. He did most of the packaging of fedmsg and its dependencies, which means that you can just apt-get install fedmsg and get it running; it's available in sid, in jessie, and in wheezy-backports. He also adapted the code of fedmsg to make it distribution-agnostic, with a lot of support from the upstream developers in Fedora. They're really excited to have their work used by Debian and by other organizations that think fedmsg is the right solution for their event notifications. And finally, we bootstrapped the Debian bus using mailing list subscriptions, to get bug notifications and package upload notifications, and on mentors.debian.net, which is a service that I control, so it's easy to add new things to it.

After the Google Summer of Code, there were some packaging adaptations to make it easier to run services based on fedmsg, proper backports, and maintenance of the bus, which mostly means keeping the software up to date, because upstream is really active and responsive to bug reports; it's really nice to work with them.

Since July 14th, 2013, the day we started sending messages on the bus, we've had around 200,000 messages, split across 155k bug mails and 45k uploads, which proves that Debian is a really active project, I guess.

The latest development is the packaging of datanommer, a database component that stores the messages that have been sent on the bus. It allows Fedora, for instance, to run queries on their messages and award people achievements, like "you had 100 build failures," or things like that.

One big issue with fedmsg, as I said earlier, is that Debian services are widely distributed, and some of the time firewall restrictions are out of Debian's control. This is also the case in the Fedora infrastructure, because some of their servers are hosted within Red Hat, and Red Hat networking sometimes doesn't want to open firewall ports. So we need a way for services to push their messages instead of having clients pull them. There is a component in fedmsg, created by the Fedora people, called fedmsg-relay, which basically acts as a tube: you push your messages into it through a ZeroMQ socket, and it then pushes them to the subscribers on the other side. It just allows you to bypass firewalls, basically. The issue is that it uses a non-standard port and a non-standard protocol; it's just ZeroMQ putting your data on the wire, and that's it.

So I'm currently pondering a way for services to push their messages using a more classic web service, where you would take your JSON dictionary and push it by POST over HTTPS, and then have that component send the message to the bus. I think that would make it easier to integrate with other Debian services; a rough sketch of the idea follows.
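To make the idea concrete, here is a hypothetical sketch of such an HTTPS-to-bus bridge. This is not an existing fedmsg component: the route, parameter names, and the choice of Flask are all my own assumptions, and a real deployment would need authentication and a proper HTTPS front end.

```python
# Hypothetical HTTPS-to-fedmsg bridge, sketching the push-over-HTTPS idea
# from the talk. Names and routes are invented for illustration.
import fedmsg
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route('/publish/<modname>/<topic>', methods=['POST'])
def publish(modname, topic):
    # A real deployment would authenticate the sender here (for example
    # with TLS client certificates), since anyone who can POST could
    # otherwise inject messages onto the bus.
    payload = request.get_json(force=True)
    fedmsg.publish(modname=modname, topic=topic, msg=payload)
    return jsonify({'status': 'ok'})


if __name__ == '__main__':
    # The built-in server is for local testing only; run behind a real
    # HTTPS-terminating web server in practice.
    app.run(port=8080)
```

A service behind a restrictive firewall could then publish with any HTTP client, for example (hostname invented): curl -X POST -H 'Content-Type: application/json' -d '{"package": "hello"}' https://bridge.example.org/publish/demo/upload.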
So, this was a really short talk, I guess; I hope there is some discussion afterwards. In conclusion, this thing works, and I'm really glad that it works, but for now it sits apart from the Debian infrastructure. The big challenge is to try and integrate fedmsg into the Debian infrastructure and use it for real. If you want to contact me, I'm olasd, and I'm here for the whole conference. If you want to talk to me about this, or to help me (I'm a little bit alone on this project, so I'd be glad if someone joined), I'd be glad to hold a hacking session later this week. So thanks for your attention. Was that clear enough?

You talked about DNS being used to publish SRV records. I'm interested in the details of what that means. What is in that SRV record, and how do I do discovery on what arbitrary sources are out there?

OK, so the idea is that to actually receive messages, you need the host and the port of the sender. And if you have, for instance, several WSGI workers, you have several ports that you need to listen to. So what we do with the SRV records lives under the domain name of the service. For instance, for ftpmaster.debian.org, we would have _fedmsg._tcp.ftpmaster.debian.org, which would point to the four or five workers that you would connect to in order to get the messages.

So if I already know that ftpmaster.debian.org is something I want to subscribe to, that's a mechanism for me to get the details. Is there something that tells me that ftpmaster.debian.org is on the list to begin with?

No, not yet. Only part of the problem is solved; currently there is no list of every single service that publishes messages. What they do in Fedora, and what we do in Debian too, is that for public consumption there is a component called the gateway, which itself connects to all the message sources and rewrites the messages to send them to clients. You don't get the replay mechanism, because that works only for a single source, but you solve your discovery problem. On the other hand, you get back the single point of failure.

So you've got the fedmsg IRC feed, which people should know about if they want to play with it. And a lot of this appears to be an ad hoc mechanism using mail for now. That's right. Do you have a list of the things which are truly providing this service right now?

The only service that has native fedmsg integration is mentors.debian.net. Through it, you get notifications for package uploads, for package removals from mentors.debian.net, and also for new comments on packages. This was basically done as a proof of concept, to make sure that a native fedmsg implementation in a Python service would work.

I'll take one more question, given the time. You've got a service which is providing this, and that's something that's available. It seems like if you want to be successful, what you need is a destination which requires this. Have you thought about what that might be? I could see you saying "we're going to make the PTS totally use this service," or I could also see you saying "I'm going to have a federated repository which is always up to date." Have you thought about what that magic draw is that might be pushing people towards this?

So lately there have been some developments regarding the quicker availability of uploaded packages: the FTP masters just started to provide incoming.debian.org publicly. Right now, as soon as you upload a package, everyone can download it within five minutes. This will enable us, for instance, to run QA checks directly when a package is uploaded. People are doing mass rebuilds of packages using clang instead of GCC; they could hook into fedmsg to actually build the packages and have the results of the clang rebuilds even before the package hits the mirrors. You do your upload, then you get your QA results before the package even reaches the users, which is, I think, a pretty good goal.

Sounds like a killer app. Yeah.
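As a sketch of what such a QA hook could look like (purely illustrative: no such consumer exists yet, and the topic name, message layout, and sbuild invocation are all assumptions), one could tail the bus and fire a rebuild on each upload:

```python
# Hypothetical QA hook: watch the bus for upload events and trigger a
# rebuild. The topic filter and payload fields are assumptions.
import subprocess

import fedmsg
import fedmsg.config

config = fedmsg.config.load_config([], None)
config['mute'] = True      # we only listen, we never publish
config['timeout'] = 0      # block forever instead of timing out

for name, endpoint, topic, msg in fedmsg.tail_messages(**config):
    # Assumed topic suffix for dak upload notifications.
    if not topic.endswith('.dak.upload'):
        continue
    source = msg['msg'].get('source')  # assumed payload field
    if source:
        # Kick off a rebuild; distribution and flags are illustrative,
        # and a clang rebuild would use a suitably configured chroot.
        subprocess.Popen(['sbuild', '-d', 'unstable', source])
```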
I also think that striving towards making Debian development fun is something that we could do. The Fedora guys have done a really great job with their badges app. Let me show you what it looks like, because I think it's really great.

So this is the Fedora badges application. The idea is that every single person who interacts with Fedora in some way gets badges. This newcomer here has two badges: you have done some testing on a package, and you have created your account, so you get a little award. And this guy is the person who created fedmsg. A lot of the top people are Fedora infra people, because they created a lot of badges to showcase their work. But let's take some other person.

Is there a badge for when you earn a badge? Yes, there is. Basically, when you attend a conference, you have a QR code to flash, and it adds a badge. When you do a package build, you get a badge. When you do 20 tests, you get a badge.

Is there a build failures one? Yeah, there is: "what goes up must come down." You get a badge when you fail a build. Everything gets one. It really makes development a bit more fun than just GPG-signing something and putting it on an FTP server; that's boring. So I think it's a fun little thing that we could introduce.

So, are there any more questions? Microphone.

About the relay: you didn't go into a lot of detail about it, but it has to sit somewhere. You stick a small service on a machine that's not firewalled, you post to it asynchronously over an HTTPS connection, and it has the open port so it can send and receive native ZeroMQ. That requires yet another place for a service to live; it's small and asynchronous, and there are advantages in that direction. But would there also have been an advantage to something like, I don't know, stunnel, or an SSH tunnel, or some other mechanism, so that you don't have to introduce yet another location for something to sit and talk? I know that there are trade-offs there; I just wondered if you had thought about it.

Well, actually, I'm not sure I can answer that. I know that we have some machines, for instance machines hosted at ARM in Great Britain, which are really, really firewalled down, where you even have a hard time hitting a mirror outside of ARM. So I think having something on HTTPS, which is a standard port, might make it easier than a random port anywhere. So yeah, I think that could do it; I think that's an advantage.

Well, I guess that's it. Thank you very much for your attention, and if you have any more questions, feel free to hit me up anywhere, anytime. Thanks.