I will hand this one over to Luigi here, who is the one who actually does all the i18n work, while people think I do it. He'll talk about translations and documentation. — So, hello everyone. Many of you already know me, but just to say a few things: in my work time I am an engineer at Red Hat in the OpenStack team, so something completely different, but still an open source team, so it's fine and nice; it's good to have a context switch. On the other side, in my free time, even before joining Red Hat, I was part of the KDE community: a KDE contributor since 2005, starting with the Italian translation team, doing random contributions to various projects, with that being one of the first big ones. Nowadays I'm still doing Italian translations and acting as commit proxy for a few translators, but mostly I'm helping out with internationalization and documentation infrastructure, code fixes related to that, and maintaining part of the documentation toolchain, especially on the tools side. There is also something I would like to do but unfortunately have no time for yet, which is quality assurance, but that's a totally different story. Going to the topic: I'm going to talk about how translation and documentation are structured, give a general idea of how it all works, with links you can check later for more detailed information, and, very importantly, show some of the challenges we are facing right now, at least as I see them from my point of view, and things that can be improved.
Just to start: the detailed explanation of the technical details of our internationalization chain is on a nice page on TechBase, but there are a few things I'd like to highlight. We have five-ish internationalization branches; the names you see are the names of the directories on Subversion, because the translations are still there, for some good reasons, especially related to the maintenance work we need to do: when you do maintenance work and need to touch the translations for all languages, you don't want the translations spread over many different repositories. And it does not impact developers, so it should still be fine. As you can see from the names, the first one is usually the one that tracks the most updated development branches; the stable one tracks the stable branches of the Frameworks 5 based things, and so on. And then there is the last one, a special branch created specifically for the Plasma LTS; when there is a new Plasma LTS it will probably be replaced, and even if there is going to be a bit of overlap, we should be able to just remove the old translations and do only bug fixes from then on. This gives you a bit more detail on which translation branch tracks which git branch, and as you can see, the meaning of trunk and stable really depends on the context. For example, in the case of Frameworks, the master branches of all the repositories which are part of Frameworks are tracked by trunk l10n-kf5, and the other l10n branches are not defined for Frameworks, because Frameworks is branchless. KDE Applications is a more complicated one: usually trunk and stable track master and, right now, Applications/17.08; depending on whether an application is still based on kdelibs4 or already on Frameworks, we could have the kdelibs4 branches, the Frameworks ones, or both.
In the case of Plasma, master is tracked by trunk, stable tracks Plasma/5.10 right now, and the LTS branch tracks Plasma/5.8. And I took an example of another project, KPhotoAlbum, which has its master branch tracked by trunk l10n-kf5 while there is still an old stable branch on 4.7; there are a few applications in this state, and we are trying to phase out the old kdelibs4 branches, for all these reasons. Where is all this information defined? The source of truth should be the repo-metadata repository, but we have this information duplicated in the translation branches as well. This is for historical reasons: it happened at some point, and still happens, that people forget to update the definitions, so strange things can occur; so we kept a separate definition which we update manually. Of course this is not efficient at all and we want to remove it and have just one source of truth, but it's not something you can do by just snapping your fingers: we need to consider all the cases and check that things are not going to break if something goes wrong. The interesting thing is that you can just follow that repository; there is a new project, done by Adam, to get this information from an API. That information was also used to generate projects.xml, if you have heard of it; it's still used in the infrastructure by a few scripts, but the idea is to move away from it.
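To make the idea of these per-project branch definitions concrete, here is a rough sketch of what such a JSON metadata file could look like. This is a hypothetical illustration only: the key names and values below are invented for the example and are not the exact schema used in repo-metadata.

```json
{
  "comment": "Hypothetical per-project i18n definition (illustrative keys only)",
  "trunk_kf5": "master",
  "stable_kf5": "Applications/17.08",
  "stable_kde4": "none"
}
```

The point is simply that each project declares which git branch each translation branch should track, instead of that mapping being hardcoded in the scripts.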
And most importantly, the information is now stored in files called i18n.json, one for each project, with some defaults in some cases; you can see an example of the definition, it's just a JSON file. The only exception is Plasma: it is not defined there, it's kind of a special exception, so you won't find it in the repositories, but it exists; the Plasma case still needs to be converted.

Some numbers, from yesterday's data (you can go and see the updated ones): the Frameworks-based trunk has more than 160,000 UI strings, which are usually small ones, though sometimes you have a long description or complex text, and more than 15 languages are over 90% translated. That's important: even if we have a lot of legacy and some teams are not so active anymore, the fact that so many languages have such a high percentage in those branches means there is real work going on there. Even more important, we have more than 16,000 documentation strings, and those are really, really huge: a single string can be a whole paragraph of a section. And still we have 9 languages which are over 75%. As a comparison, in the old stable kdelibs4 branch we still have quite a few strings, almost 15,000 UI strings; of course there is a lot of duplication, but you can see that the percentages are still higher, because the things which are still in that branch didn't move too much. It's a huge amount of strings.

The interesting thing is that everything is based on gettext, even what is not natively based on gettext. Gettext is basically the most used translation format in the GNU/Linux world; there are a few projects which are not using it, but it's nearly everywhere. Every file you see is a .po file, a simple text format where you can have a few headers with metadata attached to each string. Without going into the details: for example we are also able to handle Qt's native format, Qt .ts files; with some magic we can handle the translation of .desktop files, JSON files for plugins, AppStream files, and the documentation coming from DocBook, which I'll get to later.

One thing I would like to mention is that a few teams are using a tool called posummit, and it's really interesting. As I mentioned previously we have some duplication: for Plasma, for example, there are many, many strings in common between master and Plasma/5.10. Some things changed, and I don't have a precise number, but I would guess more than 70% of the messages are the same. This tool basically takes the same message catalog, the same PO file, from all branches and creates a consolidated file with all the strings from all branches, which is nice. My personal experience is that I couldn't manage the whole thing without it; there is no way, unless you stop doing translations. The only downside is that it's not fully integrated with our infrastructure. There are three basic operations. The first is gather: take the new translation templates and rebuild the summit templates; this is run automatically every day after scripty finishes its run. The second is merge, which each translation team should do itself: take the new summit templates and merge the existing translations with them. Finally there is scatter: take the updated translations in the summit branch and push them back to the corresponding branches; this is also something each translation team should do. So it's not totally ideal, but from my point of view the time you spend on merging and scattering is still far less than the time you would spend checking each branch separately. And this works both for the gettext translations and the ones based on the Qt system.
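To make the summit idea concrete, here is a toy sketch in Python. This is not the actual posummit tool from Pology, just an illustration of the gather/merge/scatter flow under simplified assumptions: catalogs are modeled as plain dicts from msgid to msgstr instead of real PO files.

```python
# Toy illustration of the "summit" workflow: several per-branch
# catalogs are consolidated into one summit catalog, translated
# once, and scattered back to the branches. NOT the real tool;
# catalogs are simplified to dicts mapping msgid -> msgstr.

def gather(branch_catalogs):
    """Build the summit template: the union of all msgids,
    remembering which branches each msgid appears in."""
    summit = {}
    for branch, catalog in branch_catalogs.items():
        for msgid in catalog:
            summit.setdefault(msgid, set()).add(branch)
    return summit

def merge(summit, old_translations):
    """Carry existing translations over into the new summit."""
    return {msgid: old_translations.get(msgid, "") for msgid in summit}

def scatter(summit, translations, branch_catalogs):
    """Push summit translations back to each branch catalog."""
    result = {}
    for branch, catalog in branch_catalogs.items():
        result[branch] = {msgid: translations.get(msgid, "")
                          for msgid in catalog}
    return result

# Two branches sharing most strings (like master and Plasma/5.10):
branches = {
    "master":      {"Quit": "", "New tab": "", "Wayland session": ""},
    "Plasma/5.10": {"Quit": "", "New tab": ""},
}

summit = gather(branches)                        # union of all msgids
translations = merge(summit, {"Quit": "Esci"})   # keep existing work
translations["New tab"] = "Nuova scheda"         # translate once...
per_branch = scatter(summit, translations, branches)  # ...land in both branches
```

The payoff is in the last line: a message translated once in the summit catalog reaches every branch that contains it, which is exactly why duplicated branches become manageable.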
In the documentation of the KI18n framework, which provides the gettext support, there is a nice programmers' guide; it has content which is specific to that framework, but there are also a few interesting suggestions on how to code in a way that doesn't complicate the life of translators and, in the end, gives you a better user interface. So even if it's a bit hidden in there, it's more generic and interesting. You may see in the code some files called Messages.sh. Those files can't really be run standalone; since there are many ways strings can be shown, they are used to define which files go into which gettext template: take all the files in this directory and extract their strings into this template. There are some in-place conventions for .desktop, AppStream, and JSON files, and for translations under the lib directory; you may also see calls to helpers like extract-messages, which are used to manage .desktop files and the translation of other files.

I think I already mentioned scripty at least once: it's a collection of scripts, I don't know the origin of the name, but basically it's a collection of scripts which does the magic every night, during the European night. It runs on all branches, and for each branch it gets all the strings from all repositories, updates the gettext templates, and merges the existing translations with the new templates. It also has the logic to handle the .desktop, JSON, and AppStream files, and then it injects the changes from those files back into the various repositories. So this runs every night; recently it moved to a more powerful machine. At some point it was taking like 9-10 hours; I think now it's like 5-6 hours, or even less. The biggest issue is that scripty is I/O bound, and the new system is really faster from that point of view.

The translation of the wikis is a different story, especially for UserBase, because things move around on the pages, and some of the documentation there is translated directly online. There is an extension for MediaWiki which allows you to mark parts of a page for translation and translate them directly there; you can export and import the translations as text, but not as gettext templates. From my personal point of view, in the past there was a disconnect between the wiki translators and the rest of the translation teams, so in some cases they didn't talk to each other; I found translations on Italian pages and never saw any communication from that person on the Italian translation mailing list, not even to coordinate.

What are the challenges? You may have already understood that there are some issues and things that can be improved. Scripty itself: we need more flexibility. Whenever we need support for automatic extraction of a new format which is not one of the existing ones, and even when the support for JSON or AppStream had to be added, which were the last things added, it requires some hairy changes, and that's not good, in the long term or even in the short term. The wiki translations: many teams have already expressed the need to have them in the normal workflow, and technically it's not so difficult; we would just need to extend the MediaWiki extension to export to gettext templates. Of course, if implemented, that means adding a custom format to scripty, which goes back to the previous point, and that can be complicated. But in general the extraction process is a bit convoluted, because as you saw we have this hybrid model: you have explicit definitions for the messages in the code, through the Messages.sh files, where you have to say "I want to translate this file, here", and then there are some implicit rules for some other files. So it's kind of wild, and this is not nice, again from a maintenance point of view. So the
idea, a possible idea, would be to move to a new program which would have backends to extract, and merge back, the translations for the various kinds of files. So instead of having the magic Messages.sh files, it would basically take a configuration file and do the magic for all the types it finds there; it's a move to a declarative definition of what is translatable and what is not. That would also help the release scripts, because right now there is a lot of guessing the release scripts need to do about which catalogs to include in a tarball; this would simplify things a lot.

And that brings us to a scary point: rewriting scripty, which is something you don't want to do in one shot; it's really an issue. The change I explained on the previous slide would be the first step; it can be done independently, and then we can plan and clean up parts of scripty. Because the thing I didn't say is that scripty is a collection, and I'm not kidding, of a lot of Bash, Python, Perl and C++ programs all together; that's why adding things there is not the easiest thing you can do. A possible idea, and maybe I have to reach out, I'm not sure yet, would be to build things around Buildbot, which is an automation engine, a kind of Jenkins-like thing but written in Python, and write the automation steps on top of it. Of course a lot of effort is required, and you don't want to do it all at once.

Another thing which is a bit painful, a bit less than the others because fewer people are involved, but still: when a team translates the documentation, scripty does the magic of going from DocBook to PO files, but the part where you take the PO files with the translations and regenerate the DocBook is something each team needs to do itself. The generation of the documentation is still possible, but there are a few cases, which I didn't investigate too much, where even when I do regenerate, some templates are not totally updated; there can be some English text left over, so something needs to be redone there. That's a good side task; it's not part of the other, bigger changes. And sometimes, this is connected, translations are lost in branches, because sometimes translators translate only trunk or stable and don't port the translations to the other branches; most of the teams are aware of this, but sometimes we can miss something, and the summit could be the solution. I would prefer that at some point everyone moved to the summit, but we need all the other steps we've seen before.

Also, our translation website is outdated: it doesn't use any indexing system, and that's why a translation committed on Tuesday morning shows up only on Wednesday morning, or later. There is no reservation system, and that's the reason why you have 5 or 6 different reservation systems used by translation teams to define which translator is working on which file. I tried to solve this with a Summer of Code project at some point; I think my proposal at this point would be to just switch to Damned Lies, which is the GNOME translation website. It is written in Django, and it should be easier to adapt it to our needs than to rewrite everything from scratch; I even tried to install it, and it should be quite feasible. As for online translation: there aren't many people really against it, but the problem is that it's not a solution for social issues, like a maintainer not letting translations in. Whatever we adopt still needs to fulfill a few points; these came out of a meeting two years ago and were confirmed talking about this in the last week: the tool shouldn't get in the way when people do migrations or change the structure of the repository for translations. There were some volunteers trying to do a proof of concept, but I didn't hear back from them.

And documentation is a topic which is linked to that, because we translate the documentation, but
the state of the documentation is that when I started contributing to KDE, the documentation team was really big and producing a lot of things; there were two Akademy awards at different times, so a really long history. Nowadays there are not so many contributors, unfortunately. It's still good in a few projects, let's not mention them, where the developers still maintain the documentation, and updates are still coming: if you check Plasma, Frameworks and Applications, at least if you look at the update dates in the documentation, they are not so bad. So even with the lack of manpower, the few people doing it do a great job.

The documentation is mostly written in DocBook, a format I expect you know, but just in case: scripty does the magic of extracting the gettext templates using some other tools, the translators, as I explained before, regenerate the documentation for each language, and it ends up on the website. The website is a sore point, because I'm not a web developer, but it is statically generated. The rendering engine which regenerates it and the search engine were rewritten last year, I wrote them and they are working, so I just need someone to clean up the website frontend.

The challenges on this side: a lot of people don't know DocBook. Even if I'm not sure it would increase the number of documentation contributors that much, still, in many other open source projects people are moving away from DocBook, so there are a lot of possible alternatives. My proposal would be AsciiDoc, because it's closer to DocBook: it should be easier to convert back to DocBook in a few steps and reuse the existing toolchain, because you don't want to change everything in one shot; but even in this case it would need to be implemented. AsciiDoc is another text-based format, like Markdown or reST and others like that.

There is another challenge coming: the migration to a new license. The new license is totally fine, there was a discussion about changing from the FDL to Creative Commons, and that's totally fine, but I fear that if we switch for new documents, we will need to do a lot of work to re-license the existing documentation if we want to keep using it. There is time to do it, and it will be long and painful, but there's not much we can do about that. And the website, as I mentioned before: the backend was rewritten, but we need to do the frontend, something where you can switch the language and the different versions on the fly; it needs a change in the implementation, but that's what I'd like to have. And even if the backend would allow keeping all the branches, old versions of Plasma, old versions of Applications, that piece of code is missing; that was one reason for rewriting the scripts.

I think we're more or less done, and we have some time for questions. These are the coordinates in case you want to contact the teams, both the translation team and the documentation team, both on the mailing lists. There is also a translation session on Monday, in case you're interested. If you have questions, go ahead, and thanks a lot.