More than once today, I have mentioned that GitLab is a DevOps platform that can replace multiple tools from your DIY toolchain. Today, we are excited to have Jean-Baptiste Kempf, the president of the VideoLAN organization, share the story of their migration of VLC and other projects to GitLab, and the tools they were able to replace along the way as they simplified their toolchain. Now, no journey is without its ups and downs, so he will also be detailing some of the challenges they faced during that journey, how they worked through those, how they ultimately wound up where they are, and what their GitLab roadmap looks like going forward. So let's hop in and see how it went.
Good morning. We're here to talk about the migration of VLC to GitLab. My name is Jean-Baptiste Kempf. I'm the president of the VideoLAN nonprofit organization, and I've been contributing to open-source projects around multimedia for the last 15 years. I've been leading the VLC project as one of the main developers there, but I've also been managing everything that is not code-related at the VideoLAN nonprofit. What you have to know is that VideoLAN is a nonprofit that has several projects, a bit like Mozilla and Firefox. VLC is the main one and the best known, but it's not the only project that we are hosting. And VideoLAN is also hosting infrastructure for third-party projects like FFmpeg and many others.
So what is VideoLAN? VideoLAN is this weird organization that wants to have traffic cones everywhere. It started as a student project in a university in France, in the south of Paris, and the main goal was not to create VLC. The main goal was to work on a solution to stream video on a local network, hence the name VideoLAN. It became known, of course, for VLC, which was the client part of the streaming solution, but that was not the main goal at the beginning. And at the beginning, it was not even supposed to be open source.
The open-sourcing of VLC came 20 years ago, in 2001. Since then, and it's quite important to understand this these days, it has been developed by students and then young professionals who were working on VLC and other open-source projects in their free time. So we're very far from the corporate open source that we have today. We have a team of people who are passionate and who work in their free time because they like it. Of course, most of them were students, but over maybe the last eight to ten years it has moved to basically normal developers around the world. And it's important to know that the VideoLAN nonprofit is not employing anyone and doesn't have the resources to employ anyone. So we're really one of the largest small projects of the open-source community. We organize a lot of events with other open-source projects, and that's where we discuss the future, where we are going, and how we move on the roadmap. And of course, everything related to hosting and infrastructure is a big topic, because that's very important for us and for any open-source project.
So as you can see here, the origin of VLC was something that was able to stream on a multicast network anything from DVDs, video files, graphics cards, capture cards, or even satellite, which was the beginning of the project, and to have some clients that could play those streams. The client, the VideoLAN Client, which became VLC, was one part of the solution. But there are many other tools, and those tools still go on. In 2001, VLC became open source, and it grew and grew and grew, mostly because it was able to play any type of file, any type of video. You may remember all those codec packs from the early 2000s, which were very difficult to use. VLC arrived and was basically a solution that just worked, because all the codecs were inside. It also had other advantages, mainly that it was able to play broken files and that it was able to run everywhere.
A lot of those features in VLC came because it's made of a small core and a lot of plugins, and those plugins are very important because that's how people extended VLC. The core team of VLC has always been very small, but a lot of external contributors arrived and added new formats and new support. So the way new developers arrive and contribute is a very important path for us.
Today, VLC has around maybe 400 or maybe 500 million users. We don't know, right? Because we don't have any telemetry or spyware. So we don't know exactly how many people are using VLC, but we know that it's around those kinds of numbers, possibly 300 to 400 million on desktop and around 100 million on mobile. That's quite large compared to other open-source projects, and the open-source projects in the same range are all professional. And if we count the number of downloads on our website, which is of course only a fraction of all the downloads of VLC, we are at around 3.7 billion downloads, which is also quite large. Most of our users, of course, have no idea what open source is or why they're using VLC. They don't use it because they think it's great that it is open source. They use it because it works.
VLC runs literally everywhere, from Windows to Mac to Linux. And the latest version still runs on OS/2, which means that the three users are very happy, one of them being the developer. But that's fun, because it shows that it's quite easy to port VLC to other platforms. So we have VLC on smartphones, smart TVs, smart watches, and on weird BSDs and other Unix derivatives. And one of the reasons why this is possible is that everything is a plugin. So you can write your own plugin for your own platform and still have the rest of VLC running. So as I said, VideoLAN is an umbrella organization for many projects.
So the main VLC is a kind of monorepo where there is a core, which is around 100,000 lines of code. And then you have plugins. There are probably 500 to 600 plugins, and those plugins are around 1 million lines of code. And of course, a lot of those plugins depend on third-party libraries, more or less 100 of them, and the total amount is around 10 million lines of code. But, contrary to projects like GStreamer, the core and all the plugins are in the same repository. And this core repository, which has basically the VLC core and the desktop versions of VLC, is the most important Git repository that we have.
Next to that, we've developed a version for Android and related platforms, which means Android, Android TV, Chrome OS, and so on, and one for iOS, which also means iPhone, iPad, and Apple TV. Those were a bit too different to manage, so they are in separate repositories. But they're very small compared to the main one. And if you continue, we have a lot of other repositories: for the Windows Store version, and for the bindings in C#, in Rust, in Objective-C. And that's just for VLC.
As I said, we have third-party libraries, and some of them were developed by the VLC team as VideoLAN projects. So there is, of course, a full DVD stack, with libdvdread and libdvdcss, which handles the decryption. The same goes for Blu-ray, which is basically a Blu-ray stack to be able to read any Blu-ray. We have an MPEG-2 stack, which is codecs for MPEG-2. And lately, we have an AV1 decoder called dav1d, which is also a VideoLAN project, but is used in Firefox, in Chrome, and in most of the mobile applications that do AV1 decoding. And of course, we also have the x264 encoder, which is probably one of the most used encoders on the web. All of those projects, which of course are not known to the general public, are part of the VideoLAN project, the VideoLAN non-profit.
We even host some parts of the infrastructure for FFmpeg. For example, we host a Git repository for FFmpeg. And in the past, we also used to host some binaries for HandBrake and other multimedia packages. So VideoLAN is more than VLC, and the infrastructure changes we make on VideoLAN impact more than VLC, which sometimes makes it a bit more difficult to evolve than we would like.
So how did we do things in the past? One of the important things for us is that we only use open-source projects, right? That means that everything we're running on our servers is, and must be, open source. And that's how we grew. So how did we do that? We were mainly using mailing lists for reviews and patches, like in the old days. We've been using Git since 2007, with gitweb, mostly to review. So we moved to Git quite early compared to other projects. As a bug tracker, we used Trac from Edgewall. As a wiki, we used MediaWiki. For the CI, we used Jenkins. For reviewing and tracking the patch sets, we were using Patchwork. And externally, there were two services that are not really open source that we were using. One was Coverity for static analysis, and one was for hosting translations, because that's a very tough project to handle. Those were the hosted services for the developers. For our users, we were using phpBB for the forum and IRC on Freenode since forever. Even the support for end users was done on a mailing list. And the documentation was on our website.
So this was quite well managed on basically one or two servers. But it was quite annoying, because there was a lack of integration between all of the services, and it was difficult for users to understand that they needed a different account on Trac, on the wiki, on Jenkins. And contributing to VLC became more and more difficult because of the CI. We support so many platforms that it's very difficult to contribute to VLC and not break one platform. We have four architectures on Android, four on Windows, then iOS, and so on.
So it's quite difficult to manage that. So we had to change. And one of the solutions that we discussed was one tool to rule them all, which was moving to GitLab. I think the first time we talked about moving to GitLab was in 2015. So it's a bit weird that we are talking about it in 2021, but you'll see why. The decision was to move almost everything to GitLab. At the beginning, it was just for Git hosting, merge requests, the bug tracker, and CI. Then we decided to also move the wiki, then to work on the static analyzer and on the user support desk, and even to move the documentation. So basically we have GitLab for almost everything. Outside of it, we just have IRC for chat, the forum, and still the translations, because there is no simple way to do that.
So how did it work? We started with the Android project, VLC for Android, which was mostly Java and is now Kotlin. In 2015, we moved this one directly to GitLab. That was the first project we moved. And here we were just using it as a normal Git repository and a bug tracker. Very easy. There were three or four people developing VLC for Android, so it was easy to synchronize. In early 2017, VLC for iOS, which was hosted on GitHub, moved to GitLab for the same reason. The reason we didn't want to keep it on GitHub was that GitHub is not open source, and that's basically contrary to what we want at the VideoLAN non-profit. So we moved VLC for iOS for the same things, Git hosting and bugs. And when I say Git hosting, I mean that we did not use merge requests. We were pure Git.
At the end of 2018, we moved Android to merge requests. One of the main reasons was that we had managed to get a correct CI for Android directly integrated in GitLab. Before that, it was a bit difficult because of our build system and because GitLab CI was a bit shaky for Android. Once we were able to do that, we moved VLC for Android to normal merge requests.
So people fork into their own repository and then create merge requests on VLC for Android. It went well, so we decided to continue. When we started a new project, dav1d, which is the AV1 decoder that is now used almost everywhere, we decided to have something that was fully on GitLab. So there is not even a mailing list, which we still had for VLC on Android. Everything is managed on GitLab. A lot of people from the open-source community and from VLC were complaining that moving to GitLab was not possible, that it was too difficult, too slow, or that it was going to be impossible to manage large projects with a large number of merge requests. With dav1d, we said, well, we're going to try with a smaller project, and we're going to do only GitLab. We have just an IRC channel and the dav1d GitLab project, and everything works fine. We have issues, we have merge requests, and I think not a single direct commit. Everything was done through merge requests. The issues were handled there since day one, and downstream Chrome developers or Firefox developers contributed directly there on GitLab. It works absolutely fine. And since quite early, we have had a very extensive CI on dav1d.
Once we had done this CI on dav1d, we added the CI for VLC on GitLab, which means that we were still pushing to a normal Git, on our old Git host, and we were synchronizing the repository to the VLC project on GitLab and running the CI there. Of course, it was not as good, because we would see that something breaks afterwards and not before, which is very bad. But at least it gave us the capability to check and to see what we needed. A few things were also missing in GitLab, so we reported some bugs, which were fixed. And so that's how we moved the whole of VLC to GitLab CI. Then most of the VideoLAN projects moved in the same way during 2019 and 2020.
In 2020, we started to actually tackle the big part, which was moving VLC itself to GitLab. The biggest issue after we moved the CI in 2019 was the bug reports, because the issue tracker of GitLab is quite light and doesn't match what we had on Trac, and because some developers were still very concerned about how we would manage all the merge requests without using the mailing list.
So, the issues. What we saw very early is that we needed to be self-hosted, because we want to use only open-source projects. We discussed with some people from the GitLab team and opened tickets about the issues we were seeing. There were a few bugs in CI, most of which I think were fixed. But as I said, the biggest issue was the simplicity of the bug tracker. It's an issue tracker, right? It's not a bug tracker. It doesn't fit many use cases, and it's still too simple for most of the use cases we had, and we are going to see why. The other issue was the merge request approval process. As soon as you start moving to merge requests, and we saw that quite early on dav1d, you have the tendency to want to merge as soon as CI passes and one reviewer agrees. But that's problematic for a larger project, because then you get a kind of group of developers who acknowledge each other's merge requests and then merge them. And we didn't have a solution for that at the time. Also, GitLab was, and still is a bit, very slow on large merge requests. And finally, some of the developers of VLC and FFmpeg do everything on the command line, and GitLab requires a web browser to do many things that are not possible with the command-line tools.
So why is the bug tracker too limited, even compared to something that is extremely old, like Trac? The reason is that we need custom fields. I know a lot of people say that, no, you don't need that. But yes, we do need them, for basically two things. The first thing is components, right?
Because VLC is a kind of monorepo with a core and 500 plugins, you need to be able to tag which component an issue relates to. Not everyone knows the whole code base, because the code base is quite large. So we need to be able to say, well, this is related to the video output or to the audio output. And using labels is clearly not enough, because you have so many of them if you do that. Also, Trac is a very broken piece of software, so the migration from Trac to GitLab was difficult. There were a lot of bugs, both on the Trac side and on the GitLab side. So it took us a lot of time.
The other place where we need custom fields is the platforms, right? VLC is not a web application. We have 20 platforms, right? So you need to categorize, because the developers who are working on macOS are not the same ones working on Linux or on Windows. In order to triage, you need to be able to do that. And again, labels are not enough for it.
So how do we do that? In the end, we use scoped labels, as you can see here, in order to categorize. At least a scoped label is a kind of enum, so it avoids having tickets that are on two components. So that's a bit like custom fields, but it's too limited: you cannot have a default, and you cannot search for, say, the tickets that don't have any component. Those are way too limited.
And then the biggest issue for us is that scoped labels, which were what let us work around this lack of custom fields, are not available in the open-source edition, right? So we had to migrate to Ultimate, which was sponsored by GitLab. And we had to go through the non-profit, because even if the source is available, it's technically not an open-source project, since we're using the higher-end edition. So finally, we voted to make an exception, because we really needed to move the bug tracker and use the scoped labels.
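To illustrate what "a kind of enum" means for scoped labels, here is a minimal sketch in Python. It is not VideoLAN's or GitLab's code: the label names (`Component::Video output`, `OS::macOS`) and the helper functions are hypothetical, and the sketch only models the behavior described in the talk, where a ticket carries at most one label per scope. The last function shows the kind of "no component set" check that the speaker says is hard to express as a search.

```python
from typing import List, Optional

SEPARATOR = "::"  # scoped labels are written as "scope::value"


def scope_of(label: str) -> Optional[str]:
    """Return the scope (text before the last '::'), or None for a plain label."""
    scope, sep, _value = label.rpartition(SEPARATOR)
    return scope if sep else None


def apply_label(labels: List[str], new_label: str) -> List[str]:
    """Add new_label; any existing label with the same scope is dropped.

    This models the enum-like behavior: one value per scope per ticket.
    """
    scope = scope_of(new_label)
    kept = [
        label
        for label in labels
        if label != new_label and (scope is None or scope_of(label) != scope)
    ]
    return kept + [new_label]


def missing_scope(labels: List[str], scope: str) -> bool:
    """True when a ticket has no label in the given scope (e.g. no component)."""
    return all(scope_of(label) != scope for label in labels)


# Hypothetical ticket: re-triaged from audio output to video output.
ticket = ["Component::Audio output", "OS::macOS", "crash"]
ticket = apply_label(ticket, "Component::Video output")
print(ticket)
print(missing_scope(["crash"], "Component"))
```

In this sketch, the triage step for untagged tickets has to be done by a script walking over all issues, which is the limitation compared with a real custom field that has a default value.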
But we did something very weird, which is that we disabled every other feature. We went and disabled all the features that come from the Ultimate and Premium editions and kept basically just the scoped labels. I still find it very weird that custom fields are not available, because Trac, Redmine, Bugzilla, Jira, and all of those manage that. And lately, GitHub has announced that they're supporting some custom fields. The problem is that, of course, scoped labels are okay, but then you have a lot of labels, right? You cannot search. You don't have defaults, so you cannot assign a default component, and so on. So scoped labels are a stopgap, but I think there needs to be something stronger in the future. So that was for the issue tracker. Writing the migration script took a long time, but it's open source. People can use it, and other people are already using it to move from Trac to GitLab.
The second issue we had was what we call merge request approvals. It's not a new problem that arose when we moved to GitLab. It already existed. But when we moved to something more automatic, with CI, with linked bugs, and so on, it became obvious that we needed to tackle it. And the problem is basically developers fighting, right? It's not really fighting, of course. It's more like disagreements. Not everyone is merging at the same rate. It's very easy to open a merge request, see that it goes through CI, and one or two hours later some developer says, yeah, that's okay, and then it gets merged. There is a very complex approval system in the Premium edition of GitLab. But of course, we don't use that. So what we wanted was to find a way to work around those community limitations, which are not only technical limitations. And that's how we decided to create Homer, our bot. It's called Homer because there is a famous bot around that is called Marge-bot.
And this was based on the same code base. So it's called Homer. It's a clear definition of rules about who can merge and when: whether the merge request comes from a developer or an external contributor, when you can merge your own merge request, or how long you need to wait before merging your own merge request. Because in some small communities like ours, with around 10 people working full time on things related to the core of VLC, it happens that you open a merge request and no one can review it, because you're the only one who knows that code. So when can you decide? This bot is here to apply a set of rules and to say whether you can merge or not.
Of course, we could use one of the strong approval processes that exist in GitLab, but we needed something flexible. What I mean by flexible is something that is a bit lighter than a full approval system, but that we are also able to override, right? Because sometimes you don't have enough developers, and sometimes you need to go around the process. That's why on VLC we have a technical committee that takes decisions, but we needed something that was quite light. And that's why the bot is just labeling the merge request depending on where it is in the process. It has a few issues, but nothing very important.
So here you see the graph of a merge request. When a new merge request is opened and the CI passes, the bot starts saying, well, you can review. So, pending review. Then, during review, it checks that all the threads are resolved, that all the big issues that were opened are done, and mostly that there are enough plus ones, which are thumbs up, compared to the thumbs down. When all those conditions are okay, it moves to acceptable, which is almost accepted, but sometimes you need to wait.
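The first half of that graph can be sketched as a pure function of the merge request's current state. This is only an illustration in Python, not Homer's actual code: the stage names, the `MergeRequestSnapshot` fields, and above all the vote threshold (`min_score`) are assumptions, since the talk says only that there must be "enough" thumbs up compared to thumbs down. Because the function keeps no memory between calls, it also fits the description of a bot that has no database and recomputes the label from what the API reports.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    # Hypothetical label names; the talk names only "pending review" and "acceptable".
    WAITING_FOR_CI = "waiting for CI"
    PENDING_REVIEW = "pending review"
    ACCEPTABLE = "acceptable"


@dataclass(frozen=True)
class MergeRequestSnapshot:
    """Current facts about a merge request, as a bot could read them from an API."""
    ci_passed: bool
    unresolved_threads: int
    thumbs_up: int
    thumbs_down: int


def classify(mr: MergeRequestSnapshot, min_score: int = 0) -> Stage:
    """Pick the label for a merge request from its current state only.

    min_score is an assumed threshold on (thumbs up - thumbs down);
    the real rule is not given in the talk.
    """
    if not mr.ci_passed:
        return Stage.WAITING_FOR_CI
    if mr.unresolved_threads > 0:
        return Stage.PENDING_REVIEW
    if mr.thumbs_up - mr.thumbs_down < min_score:
        return Stage.PENDING_REVIEW
    return Stage.ACCEPTABLE


# Example: CI is green, but one review thread is still open.
print(classify(MergeRequestSnapshot(True, 1, 2, 0)).value)
```

Keeping the decision as a function of a snapshot makes each rule easy to test in isolation; the waiting periods that turn an acceptable merge request into an accepted one would be a further step on top of this.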
For example, if you're a developer of the project and no one says anything on your merge request, after three days you can go ahead and merge it. But if you are an external contributor, you need an explicit plus one, an explicit approval, before it says this is accepted. And if you're a developer and at least one other developer agrees with you, then it moves to the accepted state after one day only. So the bot is just labeling this process and explaining this process. It's not merging anything. At the end, you can click on the rebase button and then the merge button.
But sometimes there is no solution, right? Sometimes there is disagreement in the community, so there is no consensus on the merge request. In that case, the merge requests are marked as stale, and then the technical committee can say, yes, we agree, or no, we don't agree, and solve the issue. So this tool is enforcing a very light system based on labels. And that's why you see that using labels instead of custom fields means that you end up with too many labels, and that's a problem.
So we moved to merge requests. We use this bot. It works pretty well. Of course, there are bugs in the bot, but the community agrees with it and likes it. It's interesting to know that the bot is stateless. It doesn't have any database. It's just using the APIs to label. So sometimes there are bugs, because of course some states are a bit weird. But it's a very simple bot, and it's very unobtrusive in the process.
So what are the next steps for our migration? The next step is the one that we started a few days ago, which is using the service desk. As I said, we're using mailing lists to answer our users. And of course that's not the best way, mostly because it's difficult to see if someone has answered.
Many people forget to reply to everyone, and so on and so on. So we are moving to the service desk to be able to answer our users. We are also soon starting the migration of our wiki from MediaWiki to GitLab. The GitLab wiki has fewer features, but it's easier for us to maintain, mostly because it's based on Git. And finally, we have some work to do on static analysis. We've done a bit of that, but it's shaky. So we need to move more projects to the integrated static analysis instead of using Coverity as an external service.
There are a few limitations and issues we've seen during this whole migration. I've talked, of course, about the issue tracker. We've talked a bit about approvals, which is not only a GitLab issue but basically a process issue. But there are some fringe features that are missing and that are blocking quite a bit. For example, the service desk doesn't have prerecorded (canned) answers, while any other service desk is able to do that, which feels weird because you're always answering the same things to your users. The API is not always complete. For example, if you click on the approve button, you have the timestamp. But if you do thumbs up and thumbs down, you don't have the timestamps. So when we wrote the bot, there were some weird cases that didn't work. And when we wanted to do a more complete bot, the API was not as good as we believe it should be.
There is a big question about the CLI. The fact that a lot of things are only doable in the web interface is a limiting factor, for two main reasons. The first reason is that some large merge requests are extremely slow. The web viewer is very, very slow, especially when you have to read or review assembly in the web browser. It's quite tricky. And the second is that some projects like FFmpeg or dav1d are done by people who write assembly and C by hand, and they are used to using the CLI. Most of the CLI tools around GitLab are limited compared to what the competition has or what we expect.
So when you move from a pure mailing-list workflow, using Mutt and IRC, reviewing everything over email and acknowledging the issues over email, to something with GitLab, some of the developers are frustrated and don't want to go along. For example, we are discussing with our friends at FFmpeg about them also moving to GitLab, and the lack of CLI tools is the biggest issue that we have there. Also, one of the things that is missing, one of the large pieces that we still use externally, is the management of translations. This doesn't exist yet in GitLab, as far as we know.
And I think that I have spent more than my allocated time. So you've seen how we managed to move the VideoLAN projects, VLC but also the other VideoLAN projects, to GitLab. It works pretty well. We moved from 10 tools to one, and I hope we are going to remove more tools in the future. Thank you, everyone.