I mean, that's pretty well understood. The problem is the road; with the whole of the band in the car, I just had to do some major off-roading to get to the house.

Hi. First of all, a warning: this talk is mostly intended for people who are planning on contributing to infrastructure, and it gives a very high-level overview of how we're running things. So if you're not interested in that, I would suggest you go to one of the other interesting talks. Just a warning; don't sit through my talk just to be polite, just go ahead. I'm only warning you. Well, if you have servers in the data center, why not?

So first of all, what do we do? There's sometimes a lot of confusion about what we do and what we do not do. What we do is host a lot of services: the Fedora websites, the wiki, the mailing lists, and the tools you use to actually build and contribute to Fedora, like Koji and the package maintainership tooling. We also use a lot of services internally, fedmsg for example, whose author is sitting right there, Ralph, which we use for managing internal stuff and kicking things off. We run a Fedora infrastructure cloud for a lot of testing and other things to help the community. We also run some web services which are not Fedora specific, some of which you might know, Pagure or Fedora Hosted, which we run for the rest of the Fedora community. But most importantly, we do not run your own home machines. You would be surprised how often we get people in #fedora-admin asking, hey, help me with my Fedora server, I've just installed it. We are not there to help you with that.

Oh, actually, the Pagure author also just arrived. You just wanted to tell me how to pronounce it? Pagure? Pagure? OK. Yeah, a lot of people have trouble with that. He's French, and he thought the name up. Also, if you have any questions during the talk, just let me know, because as I said, I'll stay at a pretty high level; if you have more in-depth questions, just ask.

So, the places where we host stuff. We have two data centers hosted by Red Hat, in Phoenix and in Raleigh, where most of the stuff is located. Raleigh is mostly a download and backup site for us, so when you download Fedora from the master mirrors, you might end up there. Pretty much all of our other services are hosted in Phoenix. And we have a bunch of donated machines, as I said, hosted by companies which have provided us with one or two machines, where we host the part of the infrastructure used to get stuff close to you.

My suggestion is, anyway, to try to also grab the slides and cover the screen. Say again? The recording tries to also get the slides properly and cover them properly. Does that mean I have to stand right there when I'm talking? Yeah. Right, I will have to stand here then. OK. No, it's using Cheese, so I don't think so. And apparently when I stand here, it can't even see my face. Let's see if that works. It does work, actually. Cool. Thank you. Who would have figured that this would actually work?

So we have a lot of data centers all around the world where we host most of the, sorry, where we host most of the reverse proxies, to which I will come in a bit. Most of these machines are just normal machines hosting our stuff, but we also have an infrastructure cloud in Phoenix which is based on OpenStack. We actually have two of them: we set up the first one, I don't know how many years ago, two or three years ago, on OpenStack Folsom, which is so old that whenever we asked for support, we always got told, no, don't use it.
Everyone's given up on it. But we set up a new instance a few months ago on Red Hat OpenStack Platform, so we're now running a supported release, which would actually help us. This setup is used, as I said, for a lot of internal testing and development instances by developers. We're also hosting Buildbot instances and Jenkins instances there, and Copr runs on it, which is a very interesting combination that I'll come back to in a bit, because it has given us a lot of pain. Copr is very nice for the users; for maintainers it's a bit annoying at times. Sorry? Yeah, which is caused by exactly the thing that causes us pain. It also fails builds randomly and then builds them successfully if you resubmit the exact same sources, without you being able to tell why. Right. But most of the time it works. It's not very repeatable and it has some bugs, but most of the time it works. Could I just suggest that you make that the team motto right now? Sorry? "Most of the time it works." "Fedora Infrastructure: most of the time it works." That's a very good one, maybe. Paul, how about it? All about metrics. Sorry? All about metrics: more than forty percent of the time it works. Yeah, perhaps more, sometimes.

OK, so this is what we're hosting, but it doesn't come automatically. As I said, this slide misses some parts, but let's try it anyway. Our main data center is in Phoenix, Arizona. We have 22 virtualization hosts there, and the master mirrors for download.fedoraproject.org are located there; actually dl.fedoraproject.org, since download forwards to the mirrors. We also have the build hosts for Koji there, and a lot of other related stuff used by the infrastructure team. The other site we have as a main data center is Raleigh, where we host the off-site backups and the other download mirrors. That's also where we had the Internet2 mirrors, which we shut down a few months ago because we weren't getting enough bandwidth. Sorry? Yeah, but the link we had was not very high quality.

We also have some remote data centers where we get donated machines from other companies. We use them as hypervisors on which we run the reverse proxies, we host the mirrorlist servers, and we host DNS servers, so that they are all local to where you are: our DNS actually has a split horizon based on the approximate region you're coming from, so that when you're in Europe, you're most likely to be sent to a European proxy. Also, at some data centers where we actually got more servers, OSUOSL and Dedicated Solutions come to mind, we run other services like Fedora People and Fedora Hosted, and some that we cannot run in Phoenix, like torrent servers, because Red Hat will not allow us to open those ports. Makes sense, but... So these are all hosted in remote data centers on machines donated by other companies, for which we are very thankful. Sorry, which one? [inaudible] Right, yeah, in fact I think that's all at OSUOSL as well.
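To make that split-horizon DNS idea a bit more concrete, here is a minimal sketch of region-dependent answers using BIND views. This is not Fedora's actual DNS configuration: the client prefixes, zone file names, and the idea of listing networks by hand are purely illustrative (the real setup maps the client's approximate region to the nearest proxy, for example using GeoIP-style data).

    acl europe_clients {
        192.0.2.0/24;        // illustrative prefixes only; the real setup
        198.51.100.0/24;     // derives the region from the client's location
    };

    view "europe" {
        match-clients { europe_clients; };
        zone "fedoraproject.org" {
            type master;
            file "fedoraproject.org.europe";   // records point at a European proxy
        };
    };

    view "default" {
        match-clients { any; };
        zone "fedoraproject.org" {
            type master;
            file "fedoraproject.org.default";  // records point at a US proxy
        };
    };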
Now, the services we run. For example the websites: our websites maintainer, Robert Mayr, is here, and if you want to help build the websites, ask him, he can point you in the right direction. They're not dynamic. They're actually built statically every thirty or so minutes. Hourly, I've just been told. They are then synced out from the build server to all of our reverse proxies, so that the proxies themselves can answer for the main websites and the traffic for that doesn't have to go all the way back to the main data center in Phoenix. This also helps because, as I said before, you usually get a mirror or a proxy that's close to your location, which means that when you request the website, it will come from pretty near you instead of from all the way across the Atlantic, sometimes. We don't have any data centers in the Asian region, though; if you know of any, please let us know.

Another service we run in all of our data centers is the metalink service. That's what yum uses to actually get your closest mirror: the mirrorlist server determines your approximate position based on your IP address, finds the closest up-to-date mirrors in its database, and sends you a list of them, including the correct checksums for the yum metadata files. The crawling of the mirrors happens centrally in Phoenix, but every hour or so we create a new database file which we sync out to all of the remote sites, so that you get your answer faster and the load is more distributed over the world.

One of the most interesting parts of our setup is how we host web applications. I've been thinking very hard about how to illustrate this, but let's try it. Our web applications are actually hosted in Phoenix, in our main data center, but they are served to you through our reverse proxies, which are in the satellite data centers or in Phoenix itself. These colors are very hard to see. As soon as you request one of our web applications, your computer sends a request to a proxy server, which terminates the TLS connection, so that's where your TLS connection ends. After that, everything goes through a private VPN we run, but internally the traffic is all plain HTTP, so that the backend servers don't need to spend time processing HTTPS requests. Also, all of the proxies run a local load-balancer daemon, so every request you send gets sent to one of the application servers for the application requested. So we have two servers for, for example, Nuancier. Is that how you pronounce it? Sorry? Close enough? OK, you'll need to teach me how to pronounce it sometime. So we have two servers for Nuancier, we have two servers for the package database, and they are all individual virtual machines. An advantage of doing it that way is that we can update one of them while the service itself stays up, because the load balancers will all hit the other application server, the one that has not gone offline. The only part in here that is tricky is our database instances, because we have just a single database backend. So if there's a big schema update, the service might still need to go down, but otherwise application updates should be able to happen without any users noticing.

All of the proxies also run a local Varnish cache, so that static files for applications and wiki pages are cached at the proxy; they don't need to be shipped all around the world, and they don't change that often. We don't cache dynamic pages when you're logged in. So for example, in the package database and Nuancier, when you're logged in, nothing is cached anymore; but when you're logged out, your page will look the same as it does for everyone else, so it will get cached on the proxy.
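As an illustration of that caching rule, a minimal Varnish VCL sketch could look like the following. This is not the actual Fedora proxy configuration; the cookie name is made up and the real VCL handles many more cases. It only shows the idea of caching anonymous pages while never caching logged-in sessions.

    vcl 4.0;

    sub vcl_recv {
        // Logged-in users carry a session cookie: never serve them from cache.
        if (req.http.Cookie ~ "session") {
            return (pass);
        }
        // Anonymous requests look the same for everybody, so drop the cookies
        // and let Varnish answer from (or populate) the cache.
        unset req.http.Cookie;
        return (hash);
    }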
We do not have caches on the application servers, only on the proxies. So after your request passes through the local cache, it passes through HAProxy, which is the load-balancing service we use. After that, it gets sent on to the backend in Phoenix, maybe all the way across the world, from Europe to the US and back, and then back to you.

So are there any... Where on that diagram does that happen? Sorry, which part? Suppose they hit the proxy in Europe: where in that diagram is the transatlantic link? Right. So your request comes in at the proxy in Europe; the proxy runs Varnish, the proxy runs HAProxy, and this line is transatlantic, the line from HAProxy to the backend server, in this case the green line, which might not be very visible. So the lines are not to scale? No. Well... That might be three thousand miles. Yeah. It might even be, I don't know, a lot, but yeah.

Any other questions about our reverse-proxy setup? Because this is the part that is quite difficult for a lot of people, and it also makes developing applications more difficult, because they actually need to support load-balanced setups. You can't store anything on a local disk without using some sort of synchronization, for which we often use Gluster if needed; otherwise we try to put everything in a database.

If I were to create a new cool Fedora application, or an application that does something useful, and I want to run it on this infrastructure, is there documentation that leads me through it and says, you have to do this if you want it to fit this setup? Yes. Or do I have to come and ask you all the time? No, we have a... So you mean what you would need to do to get something into our infrastructure, right? Not to get it in; first to code it so it will be compatible. Right. I don't think we have any documentation for that, but as long as you don't use the local file system, you should be pretty much fine. So basically no file uploads inside the application? Right. And if you do need that, which for example ask.fedoraproject.org does, we have the two nodes share a Gluster volume that keeps the file systems in sync.
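For that shared-storage case, the idea is simply that both application servers mount the same Gluster volume at the path where the uploads land. A minimal sketch; the hostname, volume name, and mount point are made up for illustration and are not taken from the real playbooks:

    # on each of the two application servers:
    mount -t glusterfs gluster01.example.org:/app-uploads /srv/uploads

    # or made persistent via /etc/fstab:
    gluster01.example.org:/app-uploads  /srv/uploads  glusterfs  defaults,_netdev  0 0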
Sorry? OpenShift Online, is there the option to use that, or is that helpful? Right. So we use OpenShift Online just for status.fedoraproject.org, since that needs to be hosted outside of our own infrastructure; otherwise we couldn't tell you what's wrong when our infrastructure is down. But the rest is all inside of our infrastructure and hosted by us. There are some things that run on OpenShift, but we just kind of put those online and say, well, that's not an infrastructure problem. Yeah. Any other questions? Pierre, I thought you had something.

So, one of our services that is used across the board, by every other application, is our identity service. As I said, it is used by every one of our applications; it authenticates against the account system, and the applications themselves use OpenID to connect to the identity server. If you want to hear more, I would suggest you visit Rob Crittenden's talk tomorrow in this room, because this time he's the one talking about OpenID and federated identity. But basically it's a central server that everyone logs in to. You'll probably have seen it; it's the login screen for pretty much every web application we have at this point. Everything? Everything, although for the wiki and Bodhi, those are still coming up. Are you trying to move something over to it? You can, but you will need to do some things locally to get it to accept insecure connections, which I would suggest you not do, so I will not tell you how to do it.

Has anybody talked to the Red Hat Bugzilla people about allowing OpenID logins to their Bugzilla, so we can stop having separate accounts? Yes, we have a ticket open on that. I don't even know if Bugzilla supports it, honestly. Well, it does; Bugzilla has a plugin for it. The ticket had gone unanswered for, I think, over a year, and if I recall correctly, one or two months ago we got an update saying they are now looking into adding it to their process. So who knows what's going to happen. If you want, I can send you the ticket number; I can search for it, unless it's private. Yeah, I think it's public. Are you one of the maintainers? I'm not, but I did work in the same office as those guys. Ah, I see. Because it is a real problem. Yeah. All I know is that they suddenly responded that they're now looking into it. OK. We'll see what goes on and what happens. So unfortunately I don't have a clear answer to that yet. That's fine; at least someone asked. Yeah.

The other thing that would have to change: currently the fedoraproject.org email is an alias tied to the account, but it is account-wide, so it's not very useful for us; it would need to work per specific request or per specific purpose. Right, that is related to some other features we're already working on. That's the first part; and for the second question, well, maybe you should ask him outside of the talk, because he's the FAS maintainer. There's been a lot of talk about having FreeIPA as FAS. I think the current plan is that, since FAS3 is so far along, we're going to move to FAS3 first, and after that look at whether we can migrate to FreeIPA, because that is going to take a long time to actually move; it will not be compatible with any of the interfaces we have at this moment. Right. Yeah. And we heard that that's pretty much ready now. Right, but we would still have to modify every single other application. So what is the right time to have this conversation with you? We need to understand, if we have these ideas, whether there is anything else we need to do to make these migrations smaller. I think we should just get together with your team and our team and see what we can do here. I mean, we have multiple days here at Flock. OK.

OK. So one of our other services that probably a lot of people know about is Copr. It's two parts, or actually three, but I'm glossing over the third, the keygen part. We have a frontend, which is the part that you actually see, and where you're actually seeing the fifty waiting builds that have got stuck. And we have the backend, which should actually pick them up and doesn't, and which makes you see those fifty waiting builds. What the backend basically does is retrieve the current list of pending builds, pick the first one of them, and spin up an instance in OpenStack which tries to build it and should report back. Most of the time this works, but sometimes we get issues where the Copr builders are not getting cleaned up, and that's what you see when you're looking at fifty or sometimes two hundred waiting builds.
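To make the backend's job a bit more concrete, here is a rough sketch of that dispatch loop in Python. This is not the actual Copr code; the frontend and cloud objects and their methods are placeholders. It only illustrates the flow described above, including the teardown step that sometimes fails and leaves stuck builders behind.

    import time

    def dispatch_loop(frontend, cloud):
        """Illustrative only: one fresh VM per build, torn down afterwards."""
        while True:
            build = frontend.get_next_waiting_build()  # hypothetical frontend API
            if build is None:
                time.sleep(10)                         # nothing pending, check again later
                continue
            vm = cloud.spawn_builder()                 # new OpenStack instance per build
            try:
                result = vm.run_build(build)           # run the build inside the VM
                frontend.report_result(build, result)  # tell the frontend how it went
            finally:
                # If this cleanup fails, a builder is left behind and builds pile up.
                cloud.destroy(vm)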
One of the reasons we spin up so many instances is that every single build gets a new virtual machine, which has the advantage that it is very well separated: one build does not impact any of the other builds. It does have the disadvantage that we spin up so many instances that we tend to hit OpenStack bugs that most people would probably never hit. Sorry? That's true. Is there anything that anybody can do to help with that? With that, or with packaging and Copr in general? Yeah, sure. Well, I know some of us have access, but I don't want to overstep. Well, no, so... Is there an SOP for resetting that? Yeah, for that you need to be an OpenStack admin, so you won't be able to do that, unfortunately. But the issue is also caused by Copr not shutting down its builders completely. So if you just talk to the Copr folks and help fix their issues, you would make me very happy, because I spend approximately one or two hours a week on this.

So, that's quite a lot of services. How do we manage all this? Because we're obviously not logging into every server to set everything up manually; that would not scale at all. We use Ansible for that. The repository for it is publicly available. The link is there, but it's also easily findable by just entering "Fedora infrastructure Ansible" into your favorite search engine. Ansible actually creates the entire virtual machines, so the only part that we install manually is the physical machines, which are kickstarted. After that, Ansible kicks off the installation of any virtual machine that is defined, which makes it very easy to set up a new server. What's actually running on the physical servers, is that RHEL 6, or 5? RHEL 7 on most of them now. Some are still stuck on RHEL 6 because they have very interesting edge cases. Like, don't touch it because you will break it if you touch it? No, that's not really the case, but we have some servers where we would need to migrate the VMs off, update the host, and move them back, and we only have one server in that data center. Another instance: we have one data center where the server has a very weird bug in the video driver, so the RHEL 7 grub doesn't really work. I have a screenshot if you want to see it; it's very funny. Have you ever seen the movie Star Wars? No? Nobody here has ever seen it. So, the title screen, which looks like that: that's how grub looks there. That's great. Very readable.

We also do this regularly: we run a master playbook which executes all of our playbooks in Ansible in check and diff mode, so that everything that is defined in Ansible but is not currently in production gets reported, because people do not always run all of the playbooks they modify. That makes for very funny bugs when we actually do run them and the configuration turns out not to be correct. It has happened more than once, unfortunately.
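In other words, something along these lines; the playbook and inventory paths here are made up, but the flags are standard Ansible:

    # dry-run everything and show what would change, without touching the hosts
    ansible-playbook -i inventory/production master.yml --check --diff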
So, we obviously shouldn't try to debug in production. We do. We should not. That's why we have a staging instance and a production instance. This is also where it becomes relevant for you: if you have a service that you would like us to run, we have a documented request-for-resources process, which says that you first need to put it into staging with Ansible playbooks, and then, after it gets approved, we can put it into production. The staging and production instances are completely separate; they are not able to talk to each other. Do you have a separate account system for staging? Yes. The account system in staging is set up entirely on its own; every service we run is set up in both staging and production. Do you ever import snapshots of it? Yes, sometimes we sync from production to staging, because people edit their accounts in production. Yeah.

So, the permission system is managed through group membership in our account system. We have the apprentice group: if you want to start contributing to Fedora infrastructure, just let us know and we will add you to that group. This will let you SSH into a lot of our boxes and look around. It will not let you do anything that needs write access; for that you will need to get into sysadmin, and later, if we think you are helping a lot in a specific area, we might add you to other groups. But please don't ask for them; we will tell you when you should join them.

So, how to get started with helping. Sorry, that's the wrong one: how you can get into contact with us if there is anything wrong with the services that we provide. Either come by #fedora-admin on Freenode, send an email to the mailing list, or file a ticket if it's something that takes longer, so we can actually track what's going on. What's the difference between #fedora-admin and #fedora-apps? #fedora-apps is more meant for people asking about the web applications, and the people developing them talk to each other there; #fedora-admin is more for you as an end user to ask for help. Not that kind of end user. Yeah, not that kind of end user: a user of our web applications. #fedora-noc is for the infrastructure people to communicate in case of problems and other network or systems operations related issues.

And here is how you can help us, because we are always happy to take on extra apprentices who can help: we're a team of just a few people, and we can hardly do everything ourselves. This is a page where you can find information on contributing to Fedora infrastructure. And this is a page which is not actually only for infrastructure, but if you're looking for things to do, please take a look there. It might not have everything, but it will often have some tasks to get you started. Just ask us in #fedora-admin, or ask any of us here at Flock; the current members are listed here, so just take a look at who's there and ask one of them, or me. So, are there any questions left?
I see at four o'clock in the schedule "state of the well-known third-party repository"; it's about RPM Fusion. How hard would it be to take all your Ansible playbooks and just replicate them somewhere else, so you could do the RPM Fusion stuff with them, if it's usable for others easily? Because the infrastructure of that repository is very bad, and I think you'll hear about that in that talk, but could we reuse what Fedora already has, or not? You can certainly reuse it, but in their case they already have a setup with a lot of things which are specific to them. If they want to pick up our Ansible playbooks, they're more than welcome to, but they have even fewer people to manage it. So I mean, how hard is it? Do I just acquire some virtual servers, hit the magic button, and Fedora infrastructure starts up from the playbooks, or won't it? Most of it will. For a lot of the services you just hit the button and you will get pretty much everything we have. There is a private repo where you have to guess what's actually in there. Right. Well, you don't, quite. Yeah, so that's the only part that's a bit tricky: we have a private repository where the keys are stored, because we don't want those to become public. But Ansible will tell you by itself, like, hey, I need this file and can't find it, and often the name is pretty self-explanatory, so you should be able to figure out what it is that you need to put in there. And if you can't figure it out, as I said, just ask one of us. We'll be glad to help you, because we're always glad to have other people use our Ansible playbooks and everything else. Any other questions? Thank you.