Good afternoon, everyone. I hope you are all back from lunch and feeling good. The next talk here in the big auditorium is by Kartik Mistry, about how Debian helped to translate thousands of Wikipedia articles. So give him a warm welcome.

Yeah, thank you. So in these 20 minutes I will talk about how Debian helped us, the Wikipedia projects, in the translation of thousands of articles.

About me: I have been a Debian developer since 2008, and I am also a Wikipedian since 2012. I was editing Wikipedia before that, but not actively. Then I joined the Wikimedia Foundation as a developer, and I saw that there were lots of issues with my local-language wiki. The language I speak is Gujarati, which is from the western part of India. Then I became an active developer there, also fixing bugs. So I am helping both projects.

So what was the problem, or actually what is the problem, with Wikipedia? The English Wikipedia recently crossed five million articles. But compare that with the other Wikipedias, the local-language Wikipedias, even for languages from India with a large number of speakers, for example Hindi and Gujarati. They have a large number of speakers across the world, but very few Wikipedia articles. So we wanted to fill this gap in an easier manner. There are ways to translate a Wikipedia article, like copying and pasting the entire article and translating it manually, but that was not an ideal solution.
So we worked on that and got a solution. The tool is called Content Translation, and it is developed by the Wikimedia Foundation Language team. It was deployed in 2014 as a beta feature in some Wikipedias, and since around mid-July 2015 it has been available in all Wikipedias. It has a three-column layout to translate articles easily and quickly, and there are lots of tools that make it easier to translate an article from one Wikipedia to another.

So what are the tools? The first is machine translation, which we will talk about a bit later. Then there is automatic reference adaptation, so you don't need to add references manually. There is image adaptation, so you can just copy the image over. Categories can be adapted automatically. The Wikidata link is handled too, so you don't need to do that. You can save and edit at any time and come back later; it is saved as your personal draft. And it can be published in another namespace, for example if you don't want to publish in the main namespace and want to do more work on it. There are lots of other advantages.

When we started with the initial Wikipedias, we first deployed on the Catalan Wikipedia; Catalan is a language from Spain. We got feedback that there is open source software called Apertium that can be used to translate to Catalan from many languages. So we looked into the status of its packaging in Debian. In 2014 it was not maintained and in quite bad shape. We talked about using it in Wikipedia, where we use our own APT repo to deploy the packages. So we talked to what is now our SRE team and came to the conclusion that we needed the latest packages, because there were lots of changes between the Debian package and the current upstream package.
I started working on those packages and pushed them into Debian, but there is always a delta between the two packages, and we wanted to use the latest one. So I packaged them along with the Debian Science team and upstream, and we got many of the packages deployed for Catalan, and then it got started.

There were concerns about machine translation: the quality depends on the language pair. For example, French-Catalan and Spanish-Catalan are similar languages, so the quality is good, and we had great success with the Catalan, Spanish and French pairs. There are also disadvantages. In Wikipedia in general, nobody likes 100% machine translation, so there were lots of concerns about it. But we made the tool so that this can be easily detected: if somebody wants to do a 100% machine translation, there are warnings and errors within the tool that discourage relying on machine translation alone.

So we are working on the next generation of the tool, which makes things easier. I am working with upstream to get more pairs packaged in Debian so that we can use them. We also use a couple of dictionaries. It's not deployed yet, but I guess there are two or three pairs where we can use the FreeDict packages already available and make sure the user has a dictionary option. We are also working to make the tool more stable and probably portable to other Wikimedia projects, and probably available as a general MediaWiki extension, so that maybe even the Debian wiki could use it.

So where is this tool? It is available on every Wikipedia: you need to go to the beta features and enable it. We can probably check if everything works here. Okay, I can check. Is it visible properly?
Yeah, so this is my personal dashboard of articles I'm working on right now. Just ignore this Area 51 one, because that was a bug, so it got stuck in draft for more than two years. So this is how the tool looks. For example, if I just start this one, it will load the work I have already done. These are some things done by machine translation, and this is the real translation I have already used. And if you want to do a new translation in some language... it's already there, but still we can do that. Yeah, it says that it already exists; we can just ignore that right now, small thing. So this is using Apertium. Then it says it's saved. Right now we cannot translate the title, but we can do that later. It will be saved on your dashboard, and then you can work on it later whenever you want.

With this tool we saw a good number of articles get published. The number is lower in English because English already has a large number of articles, but there are people who translate from other languages to English. There are some statistics, but it may be better to look at some other language. You can see that in Gujarati the number of translated articles has grown, considering we only have 28,000 articles in the Gujarati Wikipedia. So that's a little bit about that.

For every language there are numbers available on a special page, Special:ContentTranslationStats, and there is a project page on MediaWiki, Content translation, where you can see its future, how it is developed, and what things we are looking for. As every wiki has a talk page, we also have a talk page where you can leave a message or report any issues.

Also, as for the things we want to work on, considering Debian: I want to work on more machine translation tools. So maybe someone here is interested in getting more tools related to machine translation into Debian.
There are lots of packages we can work on. All the packages are maintained under Debian Science. And if someone is interested in projects like Google Summer of Code, there is a good opportunity to work with the Apertium project, not only on language pairs but also on improving the quality of machine translation in general.

So that's basically all I wanted to say. I also wanted to say thanks to Debian, because in Wikimedia most of our machines run Debian. We use Debian everywhere, right? Yeah. So it's a good use of Debian in production, and I'm very happy with the work I did along with the Apertium people. I asked a couple of people from upstream to become DMs. I guess one or two of them will apply pretty soon, because they have pretty good packaging skills. So yeah, questions?

Somehow I missed it: is all of this software already packaged in Debian?

Yeah, it's already packaged. The tool itself is a web-based tool, and it runs on Node.js and MediaWiki. Along with that, we also started working on the MediaWiki packages, because MediaWiki is fast-moving, but we also have LTS, the long-term support releases. Another guy, Kunal Mehta, is working on packaging the extensions that are required to run the latest MediaWiki. So I guess we are in pretty good shape, considering MediaWiki itself and also these machine translation packages, which are all used in production.

Okay, so there is no other question? So, yeah.

Can I ask a little bit about the model of this tool? Is it only translation, or also management of,
I don't know, links to external sources and so on? Because the power of Wikipedia is that we have different sources, so we can link to, I don't know, a press article, the source the information came from. But at the same time, if I'm translating from English to Polish, should I leave links to English articles, or should I try to find links to some Polish ones or something?

Yeah, so this tool will automatically adapt links to the existing articles available in the target Wikipedia, for example Polish. Say the English Wikipedia article on Debian has a link to Linux, and there is an article on Linux available in Polish: it will automatically link to that particular article. So you don't need to worry about those links. Images are fine too, because images come from Commons, Wikimedia Commons, which is already available to every Wikipedia. As for references, some Wikipedias require references to be available in the local language. In that case we cannot do much, because it's very hard to do, so you have to do it manually. But all the references available in the source Wikipedia will be available in the target.

Okay, pretty good. Thanks.

Yeah, thanks. So we are just over time. If you have more questions you can ask me, both by email and in person; I'm around here till Monday morning. Yeah, that's all. Thank you.
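
Editor's sketch of the link adaptation described in the answer above. This is not the Content Translation implementation; it is a minimal illustration, assuming a hypothetical hard-coded table that maps source (English) titles to target (Polish) titles. A real tool would have to look such titles up, for example through interlanguage links. The function name `adapt_links`, the table `SAMPLE_TITLES`, and the choice to turn links without a target article into plain text are all assumptions made for the example.

```python
import re

# Hypothetical lookup table: source (English) title -> target (Polish) title.
# Hard-coded here for illustration; a real tool would query this mapping.
SAMPLE_TITLES = {
    "Linux": "Linux",
    "Operating system": "System operacyjny",
}

# Matches [[Title]] and [[Title|label]] wiki links.
LINK_RE = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]*))?\]\]")


def adapt_links(wikitext, title_map):
    """Rewrite wiki links so they point at target-language articles.

    Links whose title has no entry in title_map are replaced by their
    visible label as plain text (an assumption of this sketch).
    """

    def replace(match):
        source_title = match.group(1).strip()
        # The visible text is the label if given, else the title itself.
        label = match.group(2) if match.group(2) is not None else source_title
        target_title = title_map.get(source_title)
        if target_title is None:
            return label
        if target_title == label:
            return f"[[{target_title}]]"
        return f"[[{target_title}|{label}]]"

    return LINK_RE.sub(replace, wikitext)
```

With the sample table, `adapt_links("[[Debian]] uses the [[Linux]] kernel and is an [[Operating system|OS]].", SAMPLE_TITLES)` returns `Debian uses the [[Linux]] kernel and is an [[System operacyjny|OS]].`: the Linux link is kept, the piped link is retargeted while its label is preserved, and the link without a known target article becomes plain text.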