So, my name is Jean-Baptiste Holcroft, jibecfed is my nickname in Fedora, and I have been contributing to localization for a few years now. What we did in the localization team recently was the localization of the documentation and the migration from Zanata to Weblate, and now the new big project is providing language measurements in Fedora. We were very lucky because we were able to go far into the past: the packages were still available on the mirrors, so we were able to run this system on old releases.

I will do this in a very simple manner: explaining why we do this, then the difficulties and challenges we had. Then I will show some of the results and tell you what you can do to help us, to help everyone have better localization in the Linux world.

The reason we wanted to work on measurement is to be able to understand where we are at the global level. The user experience is at the operating system level, so we had to measure not only what the Fedora community is doing, but the whole open source community, all the people contributing to the Linux ecosystem. It should also help us reduce complexity, because localization is a combination of a lot of layers, technical things, and codifications, with complexity in how to handle each language. For example, right-to-left writing, or the way to handle plurals. It's a complex world, so we try to make it easier to understand, so we can start thinking about how we are moving forward, how we are evolving over time, whether our global community is in good health or whether some languages are going down in terms of contributions.

The challenges were many, but the biggest one is that Linux, and Fedora specifically, is a distribution, so it's a combination of thousands of pieces of software. There are multiple ways to store localization files in each project. Sometimes it's in the same Git repository, sometimes in another place, sometimes it has its own release cycle.
The language codes are not harmonized, meaning that depending on the project, you can use a simple two-letter code, or you can go up to three- or five-letter codes, plus the way to write the writing script. The file formats are not harmonized at all: each file contains some information about the language, but sometimes this information is in the name of the file, sometimes it's written inside the file, sometimes it's written using a code, sometimes it's using the full name of the language. It was very complex. And of course, if you want to compute data across 20,000 pieces of software, it takes a lot of time to download everything, it takes a lot of time to compute everything, and when you have a bug, you don't want to restart from scratch. I will not go into detail on how we did it; I will go into detail on the outcomes.

We are not the first ones to try to do this. The localization team in Debian also does it. It uses some Perl, and the content is a little bit obsolete because it detects translation files in a very basic manner. In Ubuntu, they do the same, but Ubuntu does downstream contribution: what you translate in Launchpad will, most of the time, never reach the upstream project. This is a philosophical difference with Fedora, where we always contribute upstream. And in Fedora itself, we also have a project named Transtats. Transtats is a very interesting tool that focuses on the multiple places where the translations are: are they synced together, do we have all the languages, did we forget to push some languages to the package? In my opinion, it's a little bit complex, but it's way nicer than what we did with translations that Fedora project.org.

So let's talk about the results. Here you should see two little graphs. On the left, you see the number of packages in Fedora.
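As an aside on the language codes mentioned a moment ago (two-letter codes, longer codes, an optional writing script), here is a minimal Python sketch of how such codes could be split into comparable parts. This is not the code used for the Fedora statistics: the `LangTag` and `parse_lang_code` names and the pattern itself are assumptions made for illustration, and real projects use more variants than this handles.

```python
import re
from typing import NamedTuple, Optional


class LangTag(NamedTuple):
    """Hypothetical normalized form of a language code."""
    language: str
    script: Optional[str]
    territory: Optional[str]
    modifier: Optional[str]


# Assumed shapes: "fr", "pt_BR", "zh-Hans-CN", "es_419", "sr@latin".
_PATTERN = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:[_-](?P<script>[A-Za-z]{4}))?"
    r"(?:[_-](?P<territory>[A-Za-z]{2}|\d{3}))?"
    r"(?:@(?P<modifier>[A-Za-z0-9]+))?$"
)


def parse_lang_code(code: str) -> Optional[LangTag]:
    """Split a code into parts; return None when it does not look like a code
    (for example a full language name such as "French")."""
    match = _PATTERN.match(code.strip())
    if match is None:
        return None
    script = match.group("script")
    territory = match.group("territory")
    modifier = match.group("modifier")
    return LangTag(
        language=match.group("language").lower(),
        script=script.title() if script else None,
        territory=territory.upper() if territory else None,
        modifier=modifier.lower() if modifier else None,
    )


# "pt_BR" and "pt-br" end up as the same tag and can be counted together.
same = parse_lang_code("pt_BR") == parse_lang_code("pt-br")
```

Codes that return `None` would need another lookup, for instance a table of full language names, which is left out of this sketch.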
With Fedora 10, 13 years ago, we had about 6,000 packages, and out of those, almost 1,000 packages had actual translations detected inside them. In Fedora 34, we now have more than 30,000 packages. As you can see, the number of packages that have translations grows much more slowly than the number of packages in Fedora itself. That is not necessarily negative, because not all packages need translation. For example, we have wallpaper packages. We don't translate wallpapers, so if there is no translation for wallpapers, it's no big deal. But still, you can see that the number of packages with translations increased. I think in Fedora 34, we have 2,300 packages with actual translations.

But the most important part is on the right-hand side. In Fedora 10, we had more than 5 million words to translate, and now we detect more than 25 million words to translate. If you wanted to translate this as a human, alone, it would take you 12,500 hours to do everything, which is impossible for one single person.

In the 25 million words to translate, there is a lot of different content. Some content is specific to the user interface. Some is related to the documentation of the software itself, or it can be man pages, etc. The fact that we have more and more words to translate means that the coverage of what we can translate is increasing over time. We can do more translation of man pages, and more translation of documentation that is inside the software itself. For example, the first time you launch GNOME, you will see a welcome page and some tips about how to get started with GNOME. This is documentation embedded inside the software. So the amount of work required to translate is outstanding. And as you can see, the number of languages available in Fedora Linux is increasing over time.
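The 12,500-hour figure follows from the 25 million words if you assume a pace of 2,000 words per hour, the rate that comes up again in the questions. A small Python sketch of that arithmetic; the function name and the 8-hour working day are assumptions added here, not figures from the talk.

```python
def translation_hours(words: int, words_per_hour: int = 2_000) -> float:
    """Hours one translator would need at a constant pace."""
    return words / words_per_hour


hours = translation_hours(25_000_000)  # 12500.0 hours
working_days = hours / 8               # assumption: 8-hour working days
```

At that assumed pace the total comes to roughly 1,560 full working days for a single language, which is why nobody can cover the whole distribution alone.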
It means that even if it is impossible to reach 100%, there are still people who want to promote their language, who want to translate, who want to contribute and move things forward. Okay, they won't have the whole operating system translated, but it's no big deal. They will still translate the key software for them, and one by one they will slowly provide a good user experience for non-native English speakers.

On this website, it was very important to me to show diversity. There is often a misunderstanding between a language and a territory. A territory is just a political border with some rules inside it, but it can contain many, many languages. For this website, I reused data coming from the CLDR, a project of the Unicode Consortium, which contains a lot of data. Here you see the Russia territory. Russia has a population of 141 million, and you can see that there are a lot of languages. Some of them are official, like Russian, but you have a lot of official regional languages. The percentage that you see here is based on the total population, so even if only 1% of the population speaks a language, that still means more than a million people speaking it. I wanted to highlight this because we often forget that Russian is spoken a lot outside of Russia, and French is spoken a lot outside of France.

On the right-hand side, you can see the list of languages available in each release: at the top the list of languages for Fedora 34, and then the list of languages for Fedora 10. It increased a lot, and it always increased over time. I won't go into all the specifics of language codes, etc. I just wanted to highlight this fact.

For each language, I try to list the territories in which this language is spoken. Here you have French because obviously I'm French, as you could have guessed from my accent. You can see that French is spoken in many countries.
Whether it is an official language or not doesn't matter. You have many African countries, but you also have North America with Canada, and a lot of other countries where French matters. So when you contribute to the French language, you can help people who are in Canada or in Africa or in Belgium or wherever. It goes way beyond borders.

Something very important about language progress is the measurement. You can either measure the progress of French translation on what was started, or measure the French translation across every single package that can be translated. The French translation progress is 83% for what was started, and about 39% if you compare it to every single translatable string in Fedora itself. You can see there is a single script for this language, Latin, but some languages can be written in several scripts. The progress is based on the 25 million words that are available for translation. This is a major language: we have a lot of French people contributing to translation.

There are a few disclaimers to make here. When we compute the translation statistics, for now we only focus on gettext, the PO files. There are other translation file formats, but we don't cover them for now. And when we do the computation, we don't try to guess whether something is a man page, documentation, or the user interface, or whether it is a priority string in the user interface or some very hidden screen at the far end of the user interface. We don't try to do that. If you want to see that, you can go to GNOME, and they do it very well. There is no need to try to do better than them, and each project provides its own measurements. Of course, there can be bugs, either upstream in the file format or in our code. So we should be quite close to reality, but there are still some limitations.
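The two progress figures for French (progress on what was started versus progress across everything translatable) differ only in the denominator. A minimal Python sketch of that difference follows; the `PackageStats` structure, the `progress` function, and the sample numbers are invented for illustration and are not the project's actual code or data.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class PackageStats:
    """Hypothetical per-package word counts for one target language."""
    name: str
    source_words: int      # words available for translation
    translated_words: int  # words already translated (0 if never started)


def progress(packages: Iterable[PackageStats]) -> Tuple[float, float]:
    """Return (progress on started packages, progress across all packages)."""
    packages = list(packages)
    translated = sum(p.translated_words for p in packages)
    started_total = sum(p.source_words for p in packages if p.translated_words > 0)
    overall_total = sum(p.source_words for p in packages)
    on_started = translated / started_total if started_total else 0.0
    overall = translated / overall_total if overall_total else 0.0
    return on_started, overall


# Made-up sample: two packages started, one never translated.
sample = [
    PackageStats("pkg-a", source_words=1000, translated_words=900),
    PackageStats("pkg-b", source_words=500, translated_words=300),
    PackageStats("pkg-c", source_words=1500, translated_words=0),
]
on_started, overall = progress(sample)  # 0.8 on started packages, 0.4 overall
```

The untouched package only enters the second denominator, which is how a language can look healthy on started work while covering much less of the whole distribution.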
What we will have to provide in future releases is language health. We are able to do the computation for languages across releases, but we are not yet able to provide an easy progress rate for each language, because it depends: some new packages are added, and some packages are deleted from the Fedora repositories. So there is still some work to do to provide useful health measurements. And there are a lot of details to improve. For example, the list of territories here uses CLDR codes, which are not very easy to read. You can guess that BF is Burkina Faso, of course. There are a lot of traps here.

One thing this website provides is translation memories. These are basically huge files containing all the translations that were done in Fedora. This is the compendium. The terminology is an attempt to find recurrent translations of the same word, to build some kind of translation dictionary. And the translation memory is a very generic database written in XML that you can use in many translation platforms or translation tools to help you go faster when you translate, because a lot of translations already exist and you can reuse them when you are translating a program. As far as I know, there's no other organization providing these files to translators. We hope they will be very useful for translators to be more efficient, or to detect errors. For example, some languages like to translate everything. In French, we like to translate "email" because we don't like to use English words for concepts we use every day. So normally you should be able to use these files to spot such errors.

So this is an overview of the results. And there are a lot of little things that you can do to help. If you are a developer, you can help us as translators to translate.
You can make your software itself translatable, and you can help make the man pages, the websites, and the documentation translatable. For this, just keep it simple. Use standards: use gettext PO files for everything. And if you need to convert some content, as we do with AsciiDoc in the whole Fedora documentation system, use po4a. po4a means "PO for anything", and it makes it way easier to update the content and keep it easy to translate. And of course, there are some very good tools. For example, if you have to make a website, there is the Hugo static site generator, which handles localization very well and is very easy to customize. So just be careful to use good tools, to make your life easier and to make translators' lives easier.

To make it easy for us to contribute, try to provide a suitable tool that allows us to work as a translation team and automates all the dirty work that we would otherwise have to do in a Git repository to update the files. Weblate does that very well. That's the reason we chose Weblate for the Fedora community. There are many free hosting possibilities. You can be hosted by Weblate itself. You can be hosted by the Fedora Project. There are other instances at openSUSE, or you can also go to The Document Foundation.

Don't try to limit languages, whatever the size of the translator community, even if there are only 10,000 speakers in the world. If someone is willing to do the translation, just let them do the work. You may set a percentage threshold to accept a language, like "I want at least 50% of my user interface to be translated before it is included upstream." Okay, this works very well. But please don't require pull requests. Pull requests and peer review are awful when you do translation. Move this work to the translation platform. Again, you can do it in Weblate. You can do it in Transifex. You can do it in the translation tool.
It's way easier for us to do it there. And it lowers the complexity, because making pull requests requires technical knowledge that most translation contributors don't have. So no pull requests, no peer review: do this in the translation platform. In Weblate, for example, you have a log of everything happening in the translation. So even if you don't control it before it is published live in the software, you can still access everything that was done. It's a little bit like the mindset of Wikipedia, and it works very well.

And of course, if we translate but there is a release only once every five years, our work does not reach end users. So try to release often. Or you can release on a regular schedule, like GNOME does. They have a release every few months. It's not very fast, but it's perfect for what we do. We know when the next release is. We know when it will reach end users. It's easy for us to plan our work. If translators don't know when you will release your software, it can be a little bit frustrating, or they will say, "Okay, we'll do this one later because there's no rush, I don't know when the release is." Then you make your release and there's no translation in it. And then the translator will come to you and say, "Why did you do this? You forgot to send us an email." So keep it easy and release often. Even if it is just a translation release, no problem. But please do it.

I also wanted to thank all the people who provide language support, by enabling it or by translating on a daily basis. Special thanks to darknao, who helped a lot with the infrastructure work, the automation, and the optimization of execution. And to the Weblate team, who create and maintain quite a lot of the tools that we use for this extraction, from the detection of translation files to the list of languages. They do a lot of work which is very hidden. So a big thanks to them.
And thank you to the whole Fedora community for supporting this change, for making translation an important subject, and for allowing us to work on this. I think it's the right time if you have some questions. You can write them directly in the chat.

I'm trying to keep track of everything that was said. A question from Justin Flory: did the Fedora community translate 25,000 words in Fedora 34? No. The Fedora community translates only a few pieces of software. We do translate DERSEDU, we do translate DNF, we do translate Anaconda, but this is just a subset of what is provided inside the Fedora distribution. Most of the work is done in GNOME, in LibreOffice. It can be the same contributor who contributes both to the LibreOffice project and to Fedora, but it's not only the Fedora community. It's the whole Linux community, the whole operating system.

Yes, something very good said by Quentin: in Launchpad you can translate software that no longer releases new versions. This is very important. Sometimes we have software which is old or unmaintained, and with Launchpad in Ubuntu you can still translate it and improve the user experience, which is quite useful.

So what about the 2,000 words an hour? Is it a lot? Is it not a lot, Justin? Yes. I see there are some computations. Luna is sharing some... You have desktop tools to translate. If you don't like Weblate, for example, you can use Poedit. The GNOME translation platform is not really a translation platform. It's l10n.gnome.org, which is more a tool to help with the process of submitting translations and reviewing them as a team. It works well. It's not a great tool, but there are a lot of different tools that exist to do the translation.

A good question asked by Quentin: how can we share translations between distributions? This is a good question. I have no answer for now.
The first step is being able to produce a translation compendium, an aggregation of all translations. The next step will be to share it with other communities. I think it may come in the future. But for example, in Fedora, we always say "upstream first". So the question is: if we detect some new translations that were done in Ubuntu, how do we help projects integrate these new translations, so that we get them into Fedora? It's a complex question, and it touches a lot of technical stuff.

So, Marie: will you upload the slides? Of course. I will publish them, I think, on the wiki page. I don't know if this is the right place, but wherever needed.

So I'm done with this short explanation. I will just show a few links: to get the stats, and the Fedora wiki L10N page to join the Fedora localization community. Or if you want to translate directly without asking any questions, you can go straight here. It doesn't matter whether you translate in Fedora localization, or The Document Foundation, or GNOME, or KDE: whatever you translate reaches end users in the end. So whatever you do is very welcome. And to translate on translate.fedoraproject.org, there is no validation required. You just create your account, you select the language you want to translate into, and then you do your work. If you don't do it well, there will probably be someone coming to say, "Hey, this is not correct. Can you please fix it?" Or they will fix it directly without bothering you.

Just remember one thing: don't use automated translation to do your work. Try to translate the things that you use on a daily basis. It will help you do a better in-context translation. It's very difficult to translate something that you don't use or don't understand. And it's fine if you don't use some software. You don't have to use every single piece of software in the world. Just do the things that you like to use. You will take good care of them, and it will be a great contribution.
I think I will stop the video. And I have a question for you, Marie: is this recorded? Will we be able to share it? Yes, it is recorded, very good. So thank you everyone for attending this short talk. And of course, my email address is not example@fedoraproject.org. It's Jean-Baptiste Holcroft. You will find me very easily. My nickname is jibecfed. It will be way easier to find me inside the Fedora community. There are a lot of positive comments. It makes us very happy. Have a nice day. Goodbye.