So we have Boudhayan, he's one of the KDE system administrators, and he's going to talk about all those things that system administrators do, which I have no idea about because I'm not one. So let's give him a warm welcome.

Thank you. Guten Tag; I think that's how you say good afternoon in German. All right, so: have you tried turning it off and on again? If any of you have ever watched The IT Crowd, you know what I'm talking about. So, yeah, this is not a quick one, it's an hour-long presentation, but it's a look at what IT is up to in KDE. This talk was supposed to be a two-person talk with me and Nicolás, but Nicolás is still in Argentina, so it will just be me.

Okay, so, hang on. Right. So, who am I? My name is Boudhayan Gupta. I'm a fairly recent member of KDE: I started developing about 1.8 years ago, but I've slowly gravitated towards being a sysadmin. And I'm still an undergraduate student; I still have a year to go to get my degree.

So what do we have on today's menu? This is actually not all of it; there's so much stuff that we're doing and want to tell you about, but this is the bulk of what we're going to talk about. I'll take you through the kind of servers that we have. It's not a comprehensive server inventory, but you'll get an idea of the kind of hardware we operate on. Then there's the Phabricator migration: we're trying to use Phabricator to replace a whole bunch of tools, and I'll talk about the progress of that migration, where we need help, and the kinds of bugs and idiosyncrasies we're dealing with. Then kde_projects.xml.
It's an XML file with metadata about KDE projects, which at least the continuous integration system and kdesrc-build use to automatically build and test things. It had to be retired and replaced last year, so I'll talk about that, because we want to take it forward and build bigger things from it. We've also had a couple of new services since last year: we're currently mirroring all our mainline repositories on GitHub, as read-only mirrors only, and we sync all our messages between Telegram and IRC, so I'll talk about those. Then there's Propagator, KDE's first piece of server software. We're building it in-house to scratch our own itches, and we'll probably release it as a standalone product for other people to use if they find it useful; a major section of this presentation is about Propagator. And we've got a couple of new sponsors who sponsor technology for us. It's not monetary sponsorship, they provide services to us, and we'll talk about those too.

Let's start with the server inventory. Again, this is not comprehensive; there are something like 50 different servers spread across multiple platforms, so I won't give you a complete overview. Here's how we do it: we don't own any servers. There are no servers that KDE owns outright. We either rent servers, which is what we pay for, or people donate machines to us, for which we then find data centres to co-locate with, and those data centres sponsor the co-location for us. And we have some completely sponsored servers: there are private individuals who donate capacity on their personal servers, and there are companies which donate server space, or even an actual server or KVM machine, to us. Then there are the machine types.
Most of the machines that we rent, or that our sponsors have donated, are hugely powerful machines that run only as physical hosts for LXC containers; we split them up into smaller containers with LXC. We also use a bunch of KVM virtual machines: there are some companies that donate KVM instances to us, and we use DigitalOcean, which is just KVM; we'll talk about that later. And the machines are all either in continental Europe or in the United States of America. That's something we're going to have to change, because there's not enough geographical spread of the servers, so that's another thing we're working towards.

Okay, so let's get started with some important services. Anongit, the anonymous read-only git network: only two servers run the anongit network. One is Mason, in Switzerland. It's a KVM instance with 4 GB of RAM, 100 GB of hard drive space and one CPU core, donated by stepping stone of Switzerland, and it runs Debian. All our git repositories taken together are around 75 GB, so 100 GB of disk space is enough; that's only the git repositories, the SVN is another 90 GB or so. And Edison is one of our newer servers, in San Francisco. It's a KVM instance, a droplet, with 1 GB of RAM and 30 GB of disk space, and we have a 100 GB DigitalOcean block storage device attached to it to host the repositories. It's similar for anonsvn: Elder is another droplet, the same thing, except that instead of a 100 GB disk it has a 120 GB one, because the SVN repositories are just usually larger. IRC services run on a server called Spring, which is an LXC container on a physical host called Goma. And this is what our typical physical host looks like: 32 GB of RAM, an 8-core i7 at 3 GHz, and a 2 TB spinning-rust hard drive, again rented from Hetzner.
We rent almost all our machines from Hetzner. And this is actually our most powerful server yet; we just got it this year. It's called Recluse, rented from Hetzner, with 64 GB of RAM and two 500 GB SSDs that are mirrored. So yes, we trust ZFS on Linux with mission-critical data. ZFS is the file system that came out with OpenSolaris and then got adopted by FreeBSD, and Lawrence Livermore National Laboratory in the US has been running a project to port ZFS to Linux. We trust them with this mission-critical data: the resilience features that ZFS provides (built-in mirroring, built-in RAID handling) make us feel a lot safer trusting ZFS on Linux with all of this data than, say, a solution based around md, LVM and ext4. It also has an 8-core i7 3 GHz processor. It's currently under-used, because we only have code repositories on it, and we're planning to move Phabricator to it; capacity is not a problem here, so we can expand the server into other roles.

And this brings us to our next topic, the Phabricator migration. Sometime before the last Akademy, KDE as a whole, the community, decided that we are going to replace a whole bunch of tools with Phabricator. It's one tool that does project management, workboards, ticketing, a lot of stuff. And if you've ever seen Phabricator's website, or the error messages on our own Phabricator instance, it actually tries really hard to bring some humour into the process; that's a 404 page, for example.

So what do we use Phabricator for? It replaces a whole bunch of things. Repository hosting and browsing: we use Gitolite to have access control on our repositories, who gets to write to what, who gets to read what,
and QuickGit to browse the repositories. This is all going to be replaced by Phabricator's Diffusion app, which will handle access control and repository browsing by itself. Project management: we had ChiliProject, which we killed off, so we actually don't have a project management tool at all right now. Projects are currently free to switch to Phabricator if they want, and a whole bunch of projects have already switched to it for project management. Workboards: we have a tool called Kanboard, which we used to host kanban boards, task boards, workboards for people or projects, but again we've seen a whole bunch of projects just move wholesale to Phabricator, because it's so much more integrated and apparently easier to use. Etherpad, to a certain extent, is also being replaced: if people are using Etherpad to create documents, Etherpad is going to go away, and the two replacements for it are going to be documents in ownCloud or Nextcloud, or Phabricator workboards for the workboard kind of usage. Code review: Review Board, the code review tool that we currently use, is also going to be replaced by Phabricator; the new tool there is called Differential. Plasma has been using it for a long time now and they're really happy with it. Sysadmin tickets: we have actually already done this one. We have shut down Trellis and we are only accepting tickets on Phabricator. And possibly KDE Identity, because Identity is going away; it's a huge pain in the neck to operate, and one of the things we're looking at to replace Identity with is Phabricator as an SSO provider.
So yeah, Phabricator replaces a lot of apps and reduces the maintenance headache, because it's just one app to maintain now. But if you're putting so many eggs in one basket, it becomes a single point of failure: if Phabricator goes down, then a whole bunch of services go down with it. That's the downside of all of this. Phabricator actually has something to mitigate that, built-in clustering support, but that's another pain in the neck to configure, so it'll probably be a while before we have clustering running. That's on the long-term list.

So what's our migration plan? We started by installing a test instance of Phabricator on Jeff Mitchell's personal server; Jeff is one of our long-time sysadmins. So the current Phabricator that's running is actually on Jeff's personal server, and we're trying to move it to Recluse. Then what we did was import a few test projects and let people evaluate how well it works. In the end, all of Plasma ended up migrating to Phabricator, and they're apparently very happy with how it works. The downside of this is that there's way too much data now, so we can't start over; we'll actually have to migrate the database from the old machine to the new machine. The next step was to shut down Trellis and use Phabricator to accept sysadmin tickets; that's also done. This had another effect: it forced a lot of people who were not using Phabricator to create Phabricator accounts, just to be able to file tickets. So we now have a lot more of the user base on Phabricator than we did before. The next step is to transfer Phabricator to the new machine, to Recluse. For the last six hours Ben has been asking me to do this, but I'm just too tired to do it now.
I'll probably get this done later today or sometime tomorrow evening; this is what's going to happen this weekend or so. What next? After that, we'll have to get all the repositories and all the projects imported into Phabricator. We have all the Plasma projects there, and a few other projects, but we'll have to get everything transferred over, and this is going to be a mammoth task. It's not something that we sysadmins can do on our own: project maintainers have to step up and transfer their own projects. We'll tell you how, but every project maintainer has to step up and do this. Then we start using Phabricator to host repositories. Upstream needs to change a couple of things before this can happen, which I'll explain on a later slide, and we'll have to adapt Propagator to handle the way Phabricator handles repositories; I'll be talking about Propagator at a later stage, and you'll see what that involves. And once we're done setting up Phabricator for everything... yes, I know I have a counter running here. Once we've completed our primary migration to Phabricator, all our repositories are there, and everyone's using it for project management, we kill Identity and probably try to use Phabricator as a single sign-on provider.

So what are the effects so far? Users of Phabricator are happy; Plasma uses it for almost everything.
I'm the maintainer of one tool, the screenshot tool, and I use it for workboards and such too, and I find it a lot easier to manage things. The downside, again, is that there's a lot of data: people have put so much into it that migrating it is starting to become a hassle. But we'll have to migrate it; we can't start afresh. And upstream collaborates well with us. We have reported a couple of bugs, and they have been fixed in time, and we're conversing with them about features and workflows that we want. They're actually receptive to our ideas and our needs; they're not stonewalling us or anything. It's going well. And sysadmin tickets have become easier to manage. With Trellis, the old system, the problem was that tickets could only be private: a ticket could only be seen by the person who filed it and the system administrators. With Phabricator, we can choose to make a ticket public, so if someone files a ticket about a big infrastructural change, others can also weigh in on it and provide their opinion, or their agenda, things they'd want taken care of on that topic, and so on. And Trellis was over-organized: we had something like 17 categories for 10 tickets a day. That over-organization is now gone. These tickets are tasks now, so we create a task, it ends up on a workboard arranged in columns, and we just move things from one column to another depending on priority and so on. It has reduced our overhead by quite a bit.

So where do we need help? With Phabricator, we are creating a new concept called the community administrator: someone between a system administrator and a developer. A community administrator will have some permissions.
They can create repositories, move projects around, create sub-projects and so on, and they can also set permissions. We have a few community admins who share the administration workload of Phabricator with us, but we'd be really happy if more people stepped up to be community admins, because we can use all the help we can get there. And then data migration. Data migration is the biggest headache we're dealing with right now with Phabricator. Trellis, our old ticket system, has been shut down, but we still have to keep our records, so we'll have to migrate all of that data over into Phabricator. Kanboard, our old workboard, again has to be migrated to Phabricator. And repository imports. This is where we need a lot of help, especially with Trellis and Kanboard: we'll need someone to look at an SQL database dump, figure out the schema, get the data out, and then use Phabricator's Conduit API to upload it all into Phabricator. So it's a lot of scripting work, and we need some help there. Repository imports are a lot of manual work, but again, yeah, manual work.

Here are a couple of idiosyncrasies that we have to solve before we can get Phabricator fully up and running. Diffusion, Phabricator's repository hosting app, has numbers in the clone URL. A typical clone URL is ssh://git@phabricator.kde.org/diffusion/191/websites-www-kde-org.git, and that repository name is actually supposed to be a namespace triple.
It should actually be websites/www-kde-org.git, but Diffusion converts the slashes to hyphens. It's things like these that we have to get sorted out with upstream before we can start. So you can either have numbers, or, if you assign a callsign to a repository, you can use the callsign instead, but then the callsign has to be in all caps, so that also looks really ugly. It's probably better than having numbers, but it just looks ugly. MediaWiki logins randomly fail against Phabricator's OAuth provider, and only for certain people, so we're having a hard time debugging that. And Diffusion stores repositories on disk using those numbers. Even if you have a callsign, inside the directory where all the repositories are stored, a repository is going to live in a folder called 191.git, so we'll have to tap into Phabricator's Conduit API to figure out which repository 191 actually corresponds to. Or you could manually run git log, but who does that? It's not an option when you're writing scripts that deal with repositories directly.

Okay, this wasn't on the agenda, but it's important enough, and Olivier asked for it, so I'm going to discuss it now. KDE Identity is our single sign-on system: you create a KDE Identity account and you have access to all the apps, Phabricator, Review Board, Kanboard, notes, ownCloud, and so on. It's based around OpenLDAP, which doesn't scale at all and is incredibly buggy, plus a custom web front-end called Selena, written by Ben in PHP and the Yii framework, which again doesn't scale at all. LDAP itself, the protocol, the way it's designed, works really well for a user and group database; OpenLDAP as a piece of software does not. And the web front-end works well as a UI.
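As an aside on the Diffusion numbering problem above: mapping an on-disk `191.git` directory back to a human-readable repository could go through Conduit roughly like the sketch below. This is not KDE's actual tooling; the `diffusion.repository.search` endpoint is the modern Conduit API for repository lookups, and the instance URL and API token here are placeholders.

```python
import json
import re
import urllib.parse
import urllib.request

CONDUIT_URL = "https://phabricator.kde.org/api/diffusion.repository.search"

def repo_id_from_dirname(dirname):
    """Extract the numeric repository ID from an on-disk directory
    name like '191.git'; returns None if the name doesn't match."""
    match = re.fullmatch(r"(\d+)\.git", dirname)
    return int(match.group(1)) if match else None

def lookup_repo_name(repo_id, api_token):
    """Ask Conduit which repository a numeric ID corresponds to
    (network call; needs a real API token for the instance)."""
    params = urllib.parse.urlencode({
        "api.token": api_token,
        "constraints[ids][0]": repo_id,
    }).encode()
    with urllib.request.urlopen(CONDUIT_URL, data=params) as resp:
        reply = json.load(resp)
    data = reply.get("result", {}).get("data", [])
    return data[0]["fields"]["name"] if data else None
```

A mirror-management script could walk the repository directory, feed each folder name through `repo_id_from_dirname`, and resolve the names in one batch of Conduit calls instead of running git log by hand.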
I mean, we do want a web front-end, but PHP and Yii do not scale. So we have what we want, but the tools we used to build it are ageing, and we'll have to replace them at some point. The other thing we're having a problem with is spam fighting. Spam is a major problem: we have wiki pages, tasks on Phabricator and entries in Bugzilla that are filled with spam, just random phone numbers of call centres and things. Not only is it annoying, at some point we actually had a whole bunch of lawyers contacting us saying that we were violating trademarks, because we had all these things on our websites, which were actually spam. So we have to stop that. Except for Bugzilla, which is the one tool that's not integrated with KDE Identity, everything needs a KDE Identity account, so if we can stop spammers from signing up to Identity, we can actually stop the spamming. We've had to do that, but unfortunately the current method is too crude: entire ISPs, entire IP blocks, are now blocked by us. A lot of Indian ISPs, some Japanese ISPs, Linode, DigitalOcean, VPS providers: they're just blocked wholesale, because a lot of spam has been coming through them. Those people now have to email the sysadmins if they want an account. We'll have to change that, because we can't just block people wholesale. So the potential solution, and this is assuming that we don't use Phabricator as the SSO but another custom solution based around LDAP, would be to replace OpenLDAP with Fedora Directory Server or FreeIPA, and to provide OpenID and OAuth in addition to LDAP.
A lot of apps, MediaWiki for example, work better with OAuth than with LDAP, so we'd use the same database to run our own OAuth provider. And LDAP permission groups translate well to OAuth scopes, and OpenID also has scopes, so we can have the same fine-grained permissions with OAuth and OpenID that we have with LDAP. Ideally we'd want to move to a solution based completely around OAuth, but we can't: there are certain tools, including our mail servers, for example, that won't talk to it. It's not a trivial task getting them to talk to OAuth, because OAuth was designed around the web and HTTP, and those tools, the email service and Jabber for example, need something that can talk to an authentication server over a protocol that's built purely for authentication.

Okay, so to combat spam we'll probably need a probabilistic model to detect spammers. We take factors like: which ISP are you on? What is your IP address, and is it on any spam database? Which OS are you running? More people running Linux are likely to sign up with us than people running Mac or Windows, so that could be a weighting factor. Have they signed up in the recent past? Do they already have an account; are they trying to create a duplicate account? We could also try to fingerprint the browser, and that has privacy implications, but if we can do it, we get another factor to weigh with. So we take all of these factors, do Bayesian filtering with fixed weights for all of them, run some math, and figure out on a scale of zero to one how likely someone is to be a spammer. Then we'd probably divide signups into three queues: a green queue, where your account just gets created; a yellow queue, where you look slightly dodgy and we'll manually approve your account; and a red queue for definite spammers.
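A fixed-weight version of that scoring idea might look like the sketch below. The signal names, weights and queue thresholds are all made up for illustration; a real deployment would tune them, or use a properly trained Bayesian model.

```python
def spam_score(signals, weights):
    """Combine weighted risk signals (each 0.0 to 1.0) into a single
    0-to-1 score. Fixed weights stand in for a trained model."""
    total = sum(weights.values())
    raw = sum(weights[name] * float(signals.get(name, 0.0))
              for name in weights)
    return raw / total

def triage(score, yellow=0.4, red=0.8):
    """Sort a signup into the green/yellow/red queues from the talk."""
    if score >= red:
        return "red"      # definitely a spammer: reject
    if score >= yellow:
        return "yellow"   # slightly dodgy: hold for manual approval
    return "green"        # fine: create the account

# Illustrative factors only; not KDE's real signal set or weights.
WEIGHTS = {
    "ip_on_spam_dnsbl": 0.5,
    "isp_previously_blocked": 0.3,
    "non_linux_user_agent": 0.1,
    "recent_duplicate_signup": 0.1,
}
```

The appeal of a scheme like this over wholesale IP-block bans is that one bad factor (say, signing up from a blocked ISP) only pushes an applicant into the manual-approval queue instead of rejecting them outright.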
If you're in the red queue, you're not getting in. This isn't on the slides, but we have been in talks with people at Fedora infrastructure: they have built a tool that does something like this, called Basset, and we may use Basset for the probabilistic spammer detection. Again, it's just tools that we have to bring together to build a solution.

Okay, so now the kde_projects.xml replacement. What is kde_projects.xml? It's a file with metadata about our projects, served at a fixed URL. We used to have Redmine for managing our projects way back, and then some political situation ended with people forking Redmine into a new tool called ChiliProject, which we started to use. Then ChiliProject died, and by that point it had changed enough that migrating back to Redmine wasn't a practical solution at all. So ChiliProject was dead, there was no path to migrate back, security bugs weren't being fixed, and our ChiliProject server started to go down every time Google tried to index it, simply from the server load. So yeah, oops, we had to kill it without a complete replacement. But there was one part of that project infrastructure that we couldn't kill, the kde_projects.xml file, because Scarlett and Michael, these two people, would have had our heads the moment we did. I know you're there. Excellent, so that's just one person to worry about now. Oh, right, yeah, I had to get your approval before changing the file, right? I forgot. I'll add the name there. So, yeah: descriptions, repo paths, a list of maintainers. Off the top of my head I could only think of the CI and kdesrc-build using it, but, as it turns out, the Plasma build infrastructure and translations all use this file to figure out metadata for projects. So we had to keep it around.

So how did we replace it? This is the first stage, and this has been done. Without ChiliProject, we lost our database for projects.
We needed a data source. What we did was take the last available kde_projects.xml file and run a bunch of scripts to convert it into a directory structure: the project hierarchy itself is represented in the file system, and inside every directory there's a YAML file that describes the metadata for that project: the name, the description, the maintainers, whether the repo is active, the path to the repo, and so on. A separate JSON file describes the i18n branches for every project. A Python script then does the entire work in reverse: it reads all of those YAML files and generates the XML file. It runs every 30 minutes, so it isn't live; the previous version generated the file on every request to that URL, while this one is cached for 30 minutes. But it's way faster. Even the live generation with ChiliProject used to take at least 15 seconds to generate the entire file and send it back to the user, so it wasn't a fast solution. The new Python script, on the server we're running it on (Mason, which, as you saw, isn't a very powerful server: a KVM instance with 100 GB of disk space and a spinning-rust hard drive), does the work in less than three seconds. It generates the file in under three seconds while running through thousands of subdirectories, and we do this every 30 minutes. So this is what has been done.

Step two: replace the front end. This is on our long-term to-do list. We want to replace the XML file with a RESTful JSON API service. A rough cut of the API is available at apps.kde.org/api, and documentation is available too, but this is subject to change.
The API is something that I've just been hacking on in my spare time. The idea is that you don't have to download the entire 1.8 MB kde_projects.xml every time you want to look something up: you make an API call, naming whatever project you want to access in the URL, and only the relevant chunk of data is returned to you as JSON. This is going to need changes in kdesrc-build if we end up deprecating the XML file, changes in translations, and so on, so we'd like to hear comments on how well this would work out for the different teams if we end up doing it. Step three: once we have an API service, we'd like to use it to generate the kde.org/applications sub-website, because that website is outdated and we have data in two places. We have some data in the website itself and some data in the project metadata repository; we'd like to have all the data in the project metadata repository, with the website just using that data to generate the pages for the applications. So the new plan is to get data from the API service, and to get more data from upstream AppStream metadata, because that has screenshots, icons, translated human-readable names, translated descriptions and so on. We want to use all of that data to generate a dynamic site from those sources. Then we won't have any outdated web pages, because the moment someone updates the upstream metadata, we have an updated website. We'll have to coordinate that with the new kde.org website that Olivier is working on.
We need to work out a solution for how this can happen. And finally, once Phabricator becomes our tool for everything, there may be no need to keep the KDE metadata repository around any more; Phabricator's Conduit API then becomes our data store again. So that's how the kde_projects.xml story is going to work out.

Okay, new services. We now do GitHub mirroring, courtesy of a mail sent in by Martin. That started off a huge discussion, I think over 140 mails spread across three mailing lists, way too much bikeshedding, but eventually we settled on a plan to create a read-only mirroring system. No issues, tasks or pull requests; we're going to ignore those altogether, because there's no realistic way we can manage data in two places. All the development happens on KDE infrastructure, but we have our repositories on GitHub, so as long as you make commits to the KDE repositories with the same email address that is associated with your GitHub account, the contributions show up on your GitHub profile. It adds to your GitHub profile, you get discovered by potential employers, and we get to rely on GitHub in case our anongit mirrors go down, so the code is always available. Currently it syncs in real time from git.kde.org: the moment you push to git.kde.org, the new commits are pushed out to GitHub immediately. But GitHub's handling of git refs is interesting. On push, it only accepts tag and head refs, and we actually use a couple of custom refs for backups of the repository, so we can't push those. And on the GitHub server side they also have custom refs: all pull requests end up being a ref, and we don't want that to affect our mirror push infrastructure. So we can't do a plain git push --mirror.
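The workaround described next is a dry-run push followed by a filtered real push; a rough Python sketch follows. The `github` remote name and the `--porcelain` output parsing are my assumptions, not KDE's production script.

```python
import subprocess

def filter_push_refs(porcelain_output):
    """From `git push --dry-run --porcelain --mirror` output, keep only
    destination refs that are branch heads or tags, dropping custom
    backup refs and anything else GitHub would reject on push."""
    refs = []
    for line in porcelain_output.splitlines():
        parts = line.split("\t")
        if len(parts) < 2:          # skip "To <url>" / "Done" lines
            continue
        # The second field is "<src>:<dst>"; take the destination ref.
        dest = parts[1].partition(":")[2] or parts[1]
        if dest.startswith(("refs/heads/", "refs/tags/")):
            refs.append(dest)
    return refs

def mirror_to_github(remote="github"):
    """Two-step push: dry-run to list the ref updates, then push only
    the filtered refs for real."""
    dry = subprocess.run(
        ["git", "push", "--dry-run", "--porcelain", "--mirror", remote],
        capture_output=True, text=True)
    refs = filter_push_refs(dry.stdout)
    if refs:
        subprocess.run(["git", "push", remote] + refs, check=True)
```

Note this sketch only pushes updates to heads and tags; handling deletions and forced updates would need a little more parsing of the porcelain flags.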
So we do a git push --mirror, but with the --dry-run flag. We use that to get a list of all the refs that are going to be changed, then we filter that list and push only the updated head and tag refs. It's a two-step process, but it works fine, so I don't think there's any need to change it.

All right, the other service that we have started is Telegram sync. KDE sysadmin only maintains IRC as the official communication channel. Whatever happens there, we consider official, so we can refer to it as proof of a conversation having happened around something. But people ended up talking on Telegram anyway. It's an app, it's the cool thing, it has mobile support. They claim they're secure, but I don't know how that matters here. And it's actually way more convenient; that's the deal-maker. Telegram is just way more convenient, so people end up using Telegram for practical reasons, and we want IRC as our official communication channel for our reasons. So we decided: let's not treat them as second-class citizens, let's just bridge the two. Anything said on IRC appears in the Telegram group, and vice versa: if you talk on Telegram, the messages get synced to the bridged IRC channel. Ovidiu from the Kubuntu project was running a Telegram-IRC bridge for the Summer of Code project, so we just took over that infrastructure. The bridge, teleirc, is written in Node.js, which is a terrible idea for a lot of reasons (it ends up using, I think, 30% of our server's RAM), but it works, and if it works, we're happy. We have the infrastructure and the capacity, so thankfully we don't have to worry about resource usage now. So yeah, we mirror quite a few channels. You've been talking in the Akademy and QtCon Telegram groups.
They're all synced to the #akademy and #qtcon IRC channels. We mirror 10 KDE channels, two WikiToLearn channels and five Kubuntu channels. And if you want your channel and your project's Telegram group synced, all you need to do is file a sysadmin ticket on Phabricator and we'll get it done. Or, if you have rights to the sysadmin/irc-notifications.git repository yourself, it's just a JSON file in a fixed format: make the push, and every 10 minutes the service checks the repository for new changes. If there is a change, it pulls it in, restarts itself, and you're done.

Right: Propagator, KDE's first piece of server software. We have broadly five different types of repositories that we handle. We have mainline project repositories. We have scratch repositories, which people use to experiment with code or to keep their personal files around. We have personal clones, which are just clones of project repositories under someone's own account; you can clone a repository wholesale and not worry about breaking other people's workflows. Websites are also stored in git repositories; although some websites are still in Subversion, a whole bunch of them are in git. And sysadmin stores all its own repositories in git too.
We have a single server running git.kde.org and four servers running anongit mirrors, but only two of them serve publicly. QuickGit and the Review Board backups look things up from a private anongit mirror. ChiliProject doesn't exist anymore; Phabricator pulls from projects.kde.org, which is itself a private anongit mirror. That last part will go away now, because Phabricator will end up running against git.kde.org, so we'll be down to three servers. We also have a read-only GitHub mirror. All this hardware is in Germany or Switzerland, except for one server in San Francisco.

This is the scale of our operations as of March 2016. I'll admit that I didn't look up new numbers for this; I just copied this slide over from my presentation at conf.kde.in, which is why the numbers are from March 2016, but the number of repositories only changed by plus or minus ten in that time. So: 2,218 repositories, of which 745 are mainline repositories, and those 745 have to be synced with GitHub; the rest are scratches, clones, and sysadmin repositories. We see about a hundred thousand commits a year, give or take; the yearly number hovers between roughly 95,000 and 120,000. And that count only covers project, sysadmin, and scratch repositories, not websites, personal clones, and so on, so the real number may be a bit bigger.

We use Identity's LDAP database to store SSH keys, and then gitolite pulls those keys in and defines the access control rules; we have pretty URLs, and QuickGit for repository views.

This ties in with how we sync repos. Someone does a git push to git.kde.org, and custom hooks running on git.kde.org notify every anongit server via an HTTP API. Every anongit server currently runs a small HTTP daemon written in Ruby.
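The hook-to-daemon notification could be sketched like this; the endpoint path and the payload shape are assumptions made up for illustration, not KDE's documented API:

```python
import json
from urllib import request

def notify_mirror(host, repo, urlopen=request.urlopen):
    """POST a 'repository updated' notification to one mirror's HTTP
    daemon. The /api/update path and the JSON body are hypothetical."""
    payload = json.dumps({"repository": repo}).encode()
    req = request.Request(
        f"https://{host}/api/update",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # The mirror's daemon would answer once it has queued a `git pull`.
    with urlopen(req, timeout=10) as resp:
        return resp.status == 200
```

The hook would then loop over its list of anongit hosts and call this once per mirror.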
That daemon just listens; we make a call to its endpoint, and the anongit server then does a git pull to bring the changes in on its side. Unfortunately, this does not work for GitHub, because we don't have any control over GitHub's server infrastructure, so we have to push changes the normal way, with all the filtering described earlier. And we don't know if a sync failed. Although, that has actually saved us in the past: in 2013 we almost lost all our repositories because of a disk crash on one of our servers, and apparently the thing that rescued us was an anongit mirror that had not been syncing for a while, so it still had a pristine copy of the repositories, and we were able to recover from that. But in general, we don't know if a sync has failed. And why are we using HTTP for notifications when we have SSH, which is a more secure, encrypted way of doing things?

So the solution here is that git.kde.org pushes to both the anongits and GitHub, so the same thing happens for both types of remotes: we do a plain git push --mirror to the anongit servers, and for GitHub we do the whole filtering thing, find out which refs we want to push, and then push those refs explicitly. The smarter solution is to have a long-running daemon on git.kde.org that logs failed pushes and retries them periodically with increasing backoff: if a push fails, it retries five minutes later, then 15 minutes later, then 30 minutes later, and if it still doesn't work, it gives up and notifies the sysadmins, with an email or something, that this isn't working and someone should probably go take a look. And because we would have this data available, we could probably build a pretty server-status web page that people can look at to find out whether a particular server is live and currently syncing. To achieve all that, I have been working on a project called Propagator.
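Neither the hook script nor the proposed retry daemon is shown in this talk, so the following is only a minimal sketch of the two ideas just described: filtering a dry-run mirror push down to branch and tag refs, and retrying a failed push on the 5/15/30-minute schedule before alerting sysadmins. Function names, the porcelain parsing, and the alerting hook are all assumptions.

```python
import subprocess
import time

def filter_porcelain(porcelain_output):
    """Keep only updated branch and tag refs from the output of
    `git push --mirror --dry-run --porcelain`; skip deletions and
    non-ref lines such as the 'To <url>' header and 'Done' footer."""
    refs = []
    for line in porcelain_output.splitlines():
        if line.startswith("-"):            # ref deletion: skip
            continue
        parts = line.split("\t")
        if len(parts) < 2 or ":" not in parts[1]:
            continue                        # header/footer lines
        dst = parts[1].split(":", 1)[1]     # destination ref name
        if dst.startswith(("refs/heads/", "refs/tags/")):
            refs.append(dst)
    return refs

def push_with_retries(do_push, notify_sysadmin, sleep=time.sleep,
                      backoff_minutes=(5, 15, 30)):
    """Run `do_push` (returns True on success); on failure retry on an
    increasing schedule, then alert the sysadmins and give up."""
    if do_push():
        return True
    for minutes in backoff_minutes:
        sleep(minutes * 60)
        if do_push():
            return True
    notify_sysadmin("push still failing after retries")
    return False

def sync_github(remote="github"):
    """Two-step GitHub sync: dry-run to find changed refs, filter, push."""
    dry = subprocess.run(
        ["git", "push", "--mirror", "--dry-run", "--porcelain", remote],
        capture_output=True, text=True,
    )
    refs = filter_porcelain(dry.stdout)
    if not refs:
        return True
    return subprocess.run(["git", "push", remote] + refs).returncode == 0
```

A real daemon would wrap `sync_github` in `push_with_retries` per repository and log each outcome, which is exactly the data a server-status page would need.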
Propagator is probably KDE's first server software project, and it's written in Python. Specifically, it's written in Python 3.5 and uses pika; pika is a pure-Python implementation of the AMQP protocol, and that ties in with RabbitMQ, which is the server we use to handle our queues. For process pool management we had a choice between supervisord and Circus; I found Circus easier to configure, so I just used Circus. And there's a library called GitPython that lets us work with git repositories from Python.

Here's a broad overview of how this works. A git hook on git.kde.org runs a script, and that script sends a message to an AMQP exchange. Let me explain what that means. AMQP has a concept of exchanges and queues: you send a message to an exchange, a whole bunch of queues are connected to that exchange, and the exchange automatically fans the message out to all of those queues. It's basically a publish-subscribe system: you publish a message to an exchange, and anyone interested in those messages subscribes by creating a queue that's bound to that exchange.

So suppose we have two types of upstreams, anongit and GitHub, and four workers for each type of upstream, eight workers in total. There will be two queues, one for each type of upstream. The four anongit workers share a common queue and all listen on the anongit queue, and the four GitHub workers listen on the GitHub queue. The AMQP fanout exchange delivers each message to both the anongit queue and the GitHub queue, but only one of the four workers listening on a given queue will actually dequeue the message.
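The production setup uses RabbitMQ via pika, but the fanout-plus-competing-consumers behaviour just described can be illustrated with a toy in-memory model (this is a sketch of the semantics, not Propagator's actual code):

```python
from collections import deque

class FanoutExchange:
    """Toy model of an AMQP fanout exchange: every bound queue receives
    a copy of each published message, and workers sharing one queue
    compete, so each message is consumed exactly once per queue."""

    def __init__(self):
        self.queues = {}

    def bind(self, queue_name):
        self.queues[queue_name] = deque()

    def publish(self, message):
        for queue in self.queues.values():
            queue.append(message)      # fan out: one copy per queue

    def consume(self, queue_name):
        """One worker dequeues; returns None if the queue is empty."""
        queue = self.queues[queue_name]
        return queue.popleft() if queue else None

exchange = FanoutExchange()
exchange.bind("anongit")   # shared by the four anongit workers
exchange.bind("github")    # shared by the four GitHub workers
exchange.publish("kde/spectacle was updated")
```

Both queues now hold a copy of the message, but within each queue only the first worker to call `consume` gets it; a second call on the same queue comes back empty.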
So there's no duplication per remote type: only one anongit worker gets the message, and only one GitHub worker gets the message, but both types of workers get it. The workers are just sent messages saying that a given repository has been updated, so they can end up doing completely different things, not just syncing repos. They can run custom tasks, custom shell scripts, per repo. CI could be configured to be triggered from this instead of the one massive script we have right now. We could have things that work somewhat like GitHub's webhooks; we could probably have a website where you create your own webhooks and do interesting things whenever repositories are updated, and so on.

As for plans: we'll roll Propagator out concurrently with Phabricator. Propagator is actually built in two parts. One part runs on git.kde.org, and the other part runs on the anongit server itself, accepting the pushes and creating the repositories if they don't already exist there. We are already using the anongit part on our new San Francisco server, and it seems to be working fine. The server part we are going to roll out concurrently with Phabricator, so that's probably going to happen in a month or so. Longer term, when we have time, we'll probably try to build that dashboard-style web UI. There's a typo there, sorry, I'll fix that; I actually made this slide this morning, and I didn't have enough coffee before doing it.

This brings us to the final part of the presentation: new sponsors. Sometime around the beginning of this year, CDN77.com contacted us offering CDN services.
We already use Incapsula as a CDN, but we figured out that CDN77 and Incapsula are different enough in their design, their API, and so on that we could actually make use of both. So we have been using CDN77 to serve static content for a bunch of websites, including kde.org and even Krita. Krita is apparently one of our most popular apps, and every time it has a new release, the load numbers are huge. With CDN77 we didn't have to provision additional servers on our own infrastructure to handle the extra load created by a Krita release; CDN77 absorbed it completely.

The second big sponsor that has joined us is DigitalOcean. This time we did the asking, because DigitalOcean's droplets are perfectly sized: small enough to be dedicated to one isolated task, which is exactly what we wanted. They have a program for sponsoring open-source software projects with hosting and related services, so we asked, and they very enthusiastically agreed. And then, just a month after they agreed to sponsor us, they started a new service called block storage, where you can attach additional hard drive space to droplets, which otherwise only have 40 or 50 GB of disk. This works really well for us. We are currently using it to host mirrors for SVN and Git, only in the US, because block storage locations are limited: it's available in one of the two data centers in San Francisco and in one in New York. Oh, right, and this next part must have happened really recently.
Yeah, so we have been having an active dialogue with them, and they asked us how it's working out for us, and we basically told them that we're limited by their block storage locations: we want one in Europe and one in the Asia-Pacific, to serve Asia-Pacific users. So now Amsterdam has worked out, so we'll probably be spinning up a mirror there, and hopefully we'll get one in either Bangalore or Singapore, so we'll have a server to serve the massive KDE India community. That's going to work out.

And that's it. Questions? Who has questions?

I have a question related to the project metadata structure that we have everywhere, because there is this one projects XML, we use another file for the API, and there is another one for apps, for AppStream. Is there a way to have all the information either in the repo or somewhere central, so that we don't duplicate so much information?

Yeah, that has been one of our plans. I've been trying to get all of this consolidated into one place, but apparently it's easier for people to manage their own AppStream data in their own repositories, and it's easier for us to manage our sysadmin data in one consolidated place, in a sysadmin repository. So this still has to be worked out; we have to find a way to stop duplicating data, but currently we're still looking for ideas on how to do that. One thing we may end up doing, and this may be slightly orthogonal to your question: we currently have a path describing where a project resides, like kdegraphics/spectacle. We may change that so all projects just sit in a flat namespace, with no paths anymore, and replace the paths with categories or tags or something, the way blogs work.

No more questions? Then I do have a question: you mentioned you were going to remove the Etherpad thing.
Yeah, I find it very useful.

So, the problem with Etherpad has been that it just crashes way too often; it's another Node.js application doing funny things. And because it crashes so much, people just get infuriated with it and end up using ownCloud Documents instead. People themselves are switching to ownCloud Documents; it's not us pushing them to. So if we see that enough people have moved to ownCloud Documents, it's probably not worth the additional effort on our side to keep Etherpad going.

Are there any plans to migrate the wiki to Phabricator as well?

Probably not, because MediaWiki is built for wikis, and we use the advanced features in MediaWiki. Phabricator is just not advanced enough as a wiki tool for us to use it that way, so we're going to keep MediaWiki around.

What about scratches and personal clones, are they also migrated?

Yeah, and this is why we are going to need community admins. We may not have personal clones anymore, but for scratches at the very least you don't need approval from sysadmin to create a repo, so community admins will have the rights to create those repos for you. We basically want everything on Phabricator so that you can all use the code review features and the repo browser and everything; it just makes sense for everything to be there.

Martin had a question, I think. Martin Gräßlin: that was mostly about the notes thing, because we mostly use Etherpad to write collaboratively, to really write at the same time.
Yeah, I think ownCloud Documents allows you to do that, write simultaneously. But really, we want comments: if you think Etherpad is worth the trouble of keeping around, you should just come and tell us.

We can try it; I just wanted to be sure that the collaborative aspect, the writing-at-the-same-time thing, is there.

Yeah, it is there. Just have a look and see if it works well for you. Any more questions? Okay, no more questions. Thank you.