Okay, hello everyone, and welcome to the last lightning talk of the conference. I hope you've enjoyed yourselves. This is the FOSDEM infrastructure review, same as every year, presented by Richard and Basti. Normally I do this thing, but Basti has been helping a ton, and the ball of spaghetti, spit and duct tape which I left him, he turned into something usable. So I'm just going to sit here on the side, and I'm here for the Q&A, but the rest is Basti, and it's his first real public talk, so give him a big round of applause.

Well, thank you. I hope I won't screw this one up. Okay, so we'll have about 15 minutes: 10 minutes of talk and 5 minutes for Q&A, and I hope it's somewhat interesting to you.

First, the facts. The core infrastructure hasn't changed that much since the last FOSDEM in 2020. We're still running on the Cisco ASR 10k for routing, ACLs, NAT64 and DHCP. We reused several switches that were already here from the last FOSDEM; they're owned by FOSDEM. These are Cisco Catalyst 3750 switches. Our old servers, which are turning 10 this year, were still here, and they will be replaced next year.

Like all the years before, we've done everything with Prometheus, Loki and Grafana for monitoring our infrastructure, because that's what helps us run the whole conference here, and we've built some public dashboards. We put those out on a VM outside the ULB, because we were running out of bandwidth like in the years before, and I'll come back to that later.

We have a quite beefy video infrastructure. You might have seen this one here: it's a video capturing device, called the video box here at FOSDEM. It's all public, it's all open source except for one piece that's in there, and you can find it on GitHub. If you want to build one yourself, go ahead, just grab the GitHub repo and clone it. There are two of these devices, one at the camera and one here for the presenter's laptop. They send their streams to a big render farm that we have over in the K building, where, like every year, our render farm is running on laptops. The laptops send the streams off to the cloud at Hetzner, and from there we distribute them to the world, so everyone at home can see the talks.

We also have a sort of semi-automated review and cutting process. Those of you who have given talks here may know it as SReview; it has been around for years, and this is the first time it's running on Kubernetes, so we're trying to go cloud native with our infrastructure as well.

Just to show how it's all held together: these are our video boxes, I don't know if you can see it. We've got those Blackmagic encoders here that turn the signals we get, SDI and HDMI, into a useful signal we can process with the Banana Pi that's in there. Everything's wired up to a dumb switch here, and then we go out, like here, and have our own switching infrastructure inside those boxes. There's an SSD below here where, in case of a network failure, we dump everything as well, so hopefully everything that was talked about at the conference is still captured and available in case of a network breakdown. The boxes also have a nice display for the speaker, so we can see whether everything's running or not, which makes it easy for people to operate them. You don't have to be a video pro.
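As an aside, to make that "stream out, but also dump to the local SSD" fallback concrete: the sketch below is a minimal illustration assuming an ffmpeg-style pipeline; the capture devices, the ingest address and the SSD path are made up. It is not the actual video box software, which lives in the FOSDEM GitHub repositories.

```
# Hedged sketch only: send the encoded stream to the render farm while also
# writing a local copy to the SSD, so a network outage does not lose the talk.
# Device names, ingest address and SSD path are placeholders for illustration.
ffmpeg -f v4l2 -i /dev/video0 -f alsa -i hw:1 \
  -map 0:v -map 1:a -c:v libx264 -preset veryfast -c:a aac \
  -f tee "[f=mpegts]udp://renderfarm.example:9000|[f=matroska]/mnt/ssd/room-backup.mkv"
```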
You just have to wire yourself up to the box, you see a nice FOSDEM logo, you see that everything's working, and you're done; the sound goes out as well.

This is how the video system actually works. All of this can be found on the GitHub repo, so you don't have to take screenshots. If you'd like to see it in person: we will tear down this room afterwards, so everyone can have a look at the infrastructure we're using, because it's not needed after this talk.

As you can see, there are quite a few interesting things to do. These are the instructions that all our volunteers get when they wire up the whole set of buildings here in one day, on Friday. They are not here right now, but they deserve a round of applause, because they are the volunteers doing the really hard work, building up the complete FOSDEM in one day. So maybe it's time for a round of applause for them.

Here we have another diagram; this is also in the GitHub repo. You can see where each signal comes from: we have the room sound system, which is what you're hearing me through, a camera with audio gear, and the speaker laptops, and all of that gets pushed down until it reaches your device down here. There's a ton of services processing it in between, and this is almost all done with open source software, except for the encoder that's running in there, which is from Blackmagic Design.

So how is it processed? We have a rendering farm; it's laptops, 27 of them this year. For those of you who don't know, those laptops are sold after FOSDEM, so if you want one, you can grab one. This year they're already gone, but for next year, maybe you want a cheap device: you can have one with everything that's still on it, because after everything has been processed after FOSDEM, we literally don't care about what's on them. We have some racks where we just put them in, four wide, 24, no, 27 of them, and we have some switch infrastructure used for processing all that stuff. That part is not running out of bandwidth.

But coming back to what is running out of bandwidth: you might see this mess over here. This is our internet, and it looks like every common internet connection on the planet. And this is our safety net: we have a big box here where all the streams go, and it will be sent to Bulgaria to the video team right after FOSDEM, so we have a real off-site copy of everything.

Now, the challenges for this year: DNS64. For years we had been running on BIND 9, and we switched to CoreDNS, first just testing it on the Sunday of FOSDEM 2020. We saw a significant reduction in CPU usage, and that's why we've stuck with CoreDNS since then. This year we also replaced the remaining BIND installations that handled all the internal DNS and all the other recursive resolution used here to provide you with internet access.

Richie always used to show you some timelines, and that's what I'm trying to do as well. There were times when building up FOSDEM was mentally challenging for the people doing it. We got better year by year by adding some automation, getting people used to knowing what to do, and having everything set up in advance. We installed the routers, and you can see there's a slight improvement; it's getting better year over year.
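As a rough illustration of the DNS64 switch to CoreDNS described a moment ago, here is a minimal Corefile sketch. It is not FOSDEM's actual configuration; the upstream resolver addresses are placeholders.

```
.:53 {
    # Synthesize AAAA answers for IPv4-only names using the well-known NAT64 prefix.
    dns64 64:ff9b::/96
    # Placeholder upstream resolvers; the real setup differs.
    forward . 192.0.2.53 192.0.2.54
    cache 30
    prometheus :9153
    log
    errors
}
```

An IPv6-only client asking for a name that only has an A record then gets a synthesized AAAA record, with the IPv4 address embedded in the 64:ff9b::/96 prefix (hostname and resolver below are made up):

```
$ dig +short A ipv4only.example.org
192.0.2.1
$ dig +short AAAA ipv4only.example.org @192.0.2.53
64:ff9b::c000:201      # 192.0.2.1 embedded in the NAT64 prefix
```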
This year we thought it would be okay from what we knew. We just set it up in January and everything worked. We came here on the fifth of January, I think, put everything up, and it just worked, which is great, because it gives you things you don't have to care about, and there were other things to care about: the network.

Getting the network up and running here took us a bit longer this year than in the last years, because we were playing around with the second uplink that we got. We used to have one 1-gigabit uplink; last week we got a 10-gigabit uplink, and we thought, okay, just enable that and play with it. It turned out to be not that easy to get both BGP sessions running and to do it properly. That's why it took us a bit longer this year.

The monitoring was also one thing that really helps us understand whether FOSDEM is ready to go or whether someone has to stay very, very late here. In the last years we've been very good at that: basically by the end of January everything was done, but it was still January. This time everything was set up, running and working in the first half of January, and that was really great, because some people actually got some sleep at FOSDEM and didn't need to stay here very long. Everything was pre-made; you just go and look at the dashboard, see what's missing, and check it all off.

The video build-up took a bit longer this year, because we're getting old and rusty at it, and there are also many new faces that have never built up such a big conference. That's why it took us a bit longer, and the video team, I think, got the least amount of sleep of everyone running the conference.

That was the story so far. We closed FOSDEM 2020, I was also there; 2020 was really one of the best ones we ever had from a technical perspective. We had everything running via Ansible: just one command, then wait an hour until everything is deployed, and you're done. Cool, have some beer, some Mate in between, and everything was fine. Then we had this pandemic; for me, like a week after FOSDEM, everything went down. And you know, for FOSDEM 2021 and 2022 there was no conference here at the ULB, so we had no infrastructure to manage, which was quite okay. We had other things to do; as most of you have learned, we have a big Matrix installation to run, which accompanies the FOSDEM conference and helps you communicate during the event. Then there was this bad thing: the maintainer of the infrastructure left FOSDEM in between those years, and so Richie searched for someone who was dumb enough to take that over. Yeah, that's me.
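For context, that "one command, then wait until everything is deployed" Ansible workflow looks roughly like the sketch below. The playbook, inventory and role names are placeholders, not the layout of the actual repository; the real roles are in FOSDEM's public infrastructure repo.

```
# Illustrative sketch only. The whole deploy is then one command:
#
#   ansible-playbook -i inventory/fosdem site.yml
#
# site.yml ties the roles together:
- hosts: dns
  become: true
  roles:
    - coredns

- hosts: monitoring
  become: true
  roles:
    - prometheus
    - loki
    - grafana
```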
So this year we're back again, in person. Sorry, yeah, thanks.

So after almost two years we came looking for the two machines; no one had touched them. They had rebooted once or twice due to power outages in the server cabinet, but we had a working SSH key. We had tons of updates to install after literally three years. I wonder why nobody broke into those machines, because they were publicly exposed on the internet, but it was only SSH and, I think, a three or three-and-a-half-year-old Prometheus installation, which was full of bugs.

We noticed that the battery packs of the RAID controllers had been depleted. This was the only thing that actually happened in those three years: the batteries went to zero and didn't set themselves on fire. So everything was okay, the machines worked, just with a bit of performance degradation, but everything seemed to be fine.

Then we tried to run this Ansible setup from the last years, and you know, three years later Ansible has changed a lot, and if you want to use a current version of Ansible with that old stuff, you end up like this; this is me here. Start from scratch, or fix all the Ansible roles, which you can have a look at, they're also on GitHub. So we thought, okay, how do we do this, and said: Ansible be gone. We'll just fix it after FOSDEM, because we will have to renew the servers anyway and everything will change.

So, the services timeline: we had the services alive on the 8th of January, services like DNS64, and by the middle of January we had centralized all our logs. This was something Richie had been wanting for ages: easy access to the log files for everything that's running here at FOSDEM. And it was good that we had them, because we could see things like: oh, the internet line that was supposed to arrive actually came up. Nobody told us, but it came up, and we saw it thanks to the centralized logging. We were aware of things like that, and then we could go and fire up our BGP sessions.

Then, two days later, we noticed that firing up the BGP sessions wasn't such a good idea, because we lost almost all connectivity. It says "stop", but I don't care, I'll just keep talking. We lost all our connectivity and said, okay, damn it. We were in some sort of panic mode, because the reason for looking at the servers in the first place was this BIND security issue that had been announced. I read the mail on the morning of January 28th and said, okay, we have to fix the BIND installations, and then suddenly we couldn't reach the servers anymore and thought: have they already been hacked, or what's going on? Doing some back and forth with our centralized logging, and you can see this is Grafana Loki that we leveraged for that, it was really nice to debug things like that. We also noticed that there was an interface constantly flapping towards our backbone, which we could also fix within that session. After that we realized there were some MTU problems, so we had to restart BGP, and so on, back and forth, and then we finally agreed to just throw away the BGP sessions and go with the 1-gigabit line. Yesterday evening we switched to the 10-gigabit line, because the uplink had been congested since 11 in the morning, so many people using too much bandwidth. Since yesterday evening everything is okay; it's better, and we are on the 10-gigabit link.
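To give an idea of how that centralized logging gets used for debugging, like spotting the new uplink coming up or the flapping backbone interface, here are a couple of hypothetical LogQL queries of the kind you would run in Grafana against Loki. The labels, hostnames and log patterns are assumptions, not the actual FOSDEM setup.

```
# Every router syslog line mentioning BGP, e.g. session up/down messages:
{job="syslog", host=~"gw.*"} |= "BGP"

# Interface up/down events per host in 5-minute buckets; a constantly
# flapping interface shows up as a sustained spike on this graph:
sum by (host) (count_over_time({job="syslog"} |~ "LINK-3-UPDOWN" [5m]))
```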
It's better and we are on the 10 gigabit link Due to the fact that there are not so many people here today. Yesterday. They were quite a bit more The link was not fully saturated, but you can you can tell we This is the place where we could use some more bandwidth was like I don't know. This is usually time for something to eat, but at 3 30 we could actually use something of the new bandwidth that we had available So If you want to look at all of the things we have a dashboard put out there Publicly if you want to have a look at the infrastructure and the ansible rapport that will be fixed to work with current ansible versions within the next few days Just clone our infrastructure clone everything and if you have any questions I'll be glad to take them Yeah fire away As I don't see any questions then we we're about to tear down this room after this so please Don't leave anything in here because it will be cleaned and everything will be torn out If if anyone else has a question just there's We use lap the question is why do you use laptops for ending because if they have a built-in usb called battery So in place of the power outage we can easily run with them Also, they're very cheap for us We can just use the computing power and sell it at the same price that we bought it to the people here You get a cheap laptop. We get a some computing time on them before and That's the main reason for running it on laptops Well, actually the question was why you were using banana pie That's a good question. The thing is that the capabilities of the banana pie were A bit better than the raspberry pi the times the decision was made if you see there's a big It's a big lcd screen in front of the boxes where you can see that thing I think it was with driving those lcd panels and also the computing power available on the banana pie that wasn't Yeah, but actually we have to look that up in the in the repo. There's everything documented Okay Yeah, yeah, there's another one in the front So the question was is there are if there are any public dashboards out there Yeah, we've put some public dashboards on dashboard dot graffana dot orc with oh dashboard dot Fasdom dot orc. Sorry Which you can have a look at the infrastructure We used to have some more dashboards like the t-shirts that have been sold but due to the fact that we changed the shop We converted to something that we bought to an open source solution and The thing is we totally forgot to monitor that so that's But there are some dashboards out there to monitor it and if you want to have something to see something more Just come come to me after the talk and I'll show you something more here at the laptop. Okay Yeah another one The biggest one standing here No actually the the biggest issues we had was like Running all that stuff after three years and and Not having set up everything properly was quite challenging like on saturday morning. We had to run and redo the whole video installation on on the k building because of You see those transmitters here. They were not plugged properly. 
That was one thing, and then another very challenging thing was, when we played around, and as I said, we did not engineer anything properly, when we played around with the BGP sessions, it was not clear how long it would take until things were distributed to the whole net. We were literally just trying to find out: is it working or not? Until that BGP information propagates from here to the rest of the planet, to somewhere like Brazil, it takes quite some time. So when you set up a BGP session you can't be sure that everything works, because stuff will hit the fan after 10, 20, 30 minutes and not instantly, and it's quite a problem that you don't get instant feedback on whether things are going well or not.

So the question was whether the Wi-Fi problems we had here on site were due to our BGP experiments or due to something else, solar flares or whatever. The thing is, we had some issues. We've been given access to the WLCs, the wireless LAN controllers; you see these boxes over there, they're centrally controlled, and we'd have to dig into that. We have some visibility into the infrastructure that's owned by the ULB, they've given us access to it so we can engineer it, but we're not quite sure why that happened. Most of the time the "FOSDEM" network, which is IPv6-only, was working quite well, except for some Apple devices that tend to just set up an IPv4 address even if there's no proper IPv4, and then things get complicated. The "FOSDEM-dualstack" network, which is dual-stacked, usually worked for most of the Apple devices. But we're not very certain.

Yeah, there's another one. So the question is whether the live streams will be made rewindable or not. Honestly, I can't tell you, I don't know. I can ask the video guys if they're planning that for next year, but there's no plan for that as far as I know. The biggest challenge was to redo things with HDMI instead of the VGA we had in the last years.

But there's another one. So the question is about the servers we're planning to use: do we know what's planned for next year? We'll have a talk about that next week, I think, when we go through the post-mortem, which is usually a week after FOSDEM, and then we decide on things to be bought for next year, because the switches are old and the router is also getting older. We have one more year to go on the router, that should be fine for next year, but for what comes after that we have to make some decisions and some investments. That will be done next week, when we've all cooled down a bit and are refreshed after this FOSDEM.

Anyone else? Yeah, come. So the question was what part of the infrastructure is being reused and what we bring in for the event. Well, in numbers, I think it was three truckloads of stuff, no, three, because the video gear arrived in a second truck. We bring mainly cameras and those boxes here; the switches stay at the ULB, most of them stay here, but the ones that don't stay here won't be needed next year, because the ULB is planning to do some tidying up and to give us some ports here for our VLANs. They're very, very good at working with us.
We get access to most of the infrastructure: we just tell them what we would like to use, and they throw it onto their controllers and bridge it to our servers, and we can use it and have fun with it. And they will be replacing part of the network infrastructure next year, so we will then have to bring even less gear here.

Yeah, which one first? Yeah. So the question was about all the other stuff that FOSDEM is doing throughout the year: do we host it on our own hardware, is it in the cloud, or somewhere else? We have a company called Tigron, a Belgian provider, and most of the stuff is running at Tigron during the year. During FOSDEM we also spin up some VMs at Hetzner in Germany, and they are only for the event and a short time after the event, for things like cutting videos and so on in the cloud, and they will be turned off after two or three weeks, and then everything runs at Tigron again, on our own hardware there.

So, there was another question. The question was what is being used for communication between the volunteers. We have that Matrix setup; I don't know who's aware of Matrix, it's a real-time communication tool, like Slack or something like that. We've used Matrix since 2020, internally, for our video team, for communicating, and then we expanded it, and with the pandemic we opened it up to everyone, and now the volunteers are also coordinated through it. And we also have our own radio gear here, especially for the event build-up, and volunteers can also be reached via those radios. Am I correct, volunteers? Yes? Okay, we have two volunteers here, so yeah.

Is there anything else you want to know? Where's the money? The question is: where's the money, Lebowski? That's the real phrase from the film. I don't actually know, I'm not yet a member of the FOSDEM staff, so you'd have to ask someone in a yellow shirt. There happens to be one next to me, just away from the microphone: we have a money box and a bank account.

Anyone else? Three, two, one. Thank you very much.