Hi Amir, can you hear me? Hello, can you hear me? So hi everyone, welcome to this session. It's about small wikis, and it will be presented by Amir. Amir is online, so I'm going to have him live now.
Great, so a quick test: can anybody hear me? Just let me know if you hear me. Can you hear me? If you can, can you say something? I can hear you, can you hear me? Wait a minute, we cannot hear you yet. What can I do? We can hear you now. Please say it again. One, two, three, four, five. We can hear you perfectly now.
Okay, hi, great. Hello, good afternoon, Singapore. I'm talking to you from Providence, Florida. I just moved here a few days ago, so it's the middle of the night here, but I'm really happy to be here. Some people know me because I visit some community conferences, and some people don't. So this is part of my style: I love languages, I love language diversity, and I love mentioning various languages as I'm speaking, so this is a particular language about which I've been very curious recently. In case you haven't yet figured out what this text says, you will probably figure it out by the end of the presentation.
So my name is Amir, and I've been involved in making the world of Wikimedia more diverse in terms of languages for many years. I've been editing since late 2004. I've been involved in all kinds of language diversity efforts: in the Language Committee, as part of the Wikimedia Foundation staff, and so on. I'll mention these things briefly, but really this is a somewhat personal talk in which I'm going to try to show a brief practical history of how language diversity developed in the Wikimedia world since 2006 or so. Also even before that, but 2006 is a key year, and I'll try to explain why. And how can we make it even better in the future? Because we are far from perfect. We are better than a lot of other web platforms, but we can be much better according to our vision. And let's talk about our vision.
You probably know our vision: imagine a world, and so on. This vision is never complete. It's by definition incomplete. If you go to the page at meta.wikimedia.org/wiki/Vision, that page says that this is an aspiration. It's an ambition. We are not sure that we will ever get there, but maybe someday we will get to a world in which every single human being can share in the sum of all knowledge. Now, as I said, I care about languages. So to me, logically, every human being means every language. So what can we do about this?
Before I really go on, I need to give some disclaimers, because they're really important to avoid any misunderstandings. These are just my thoughts. They are based on my experience as a person. I'm not representing the Wikimedia Foundation, or the Language Committee, or the translatewiki administrators, or anybody else. This is me, based on my experience of talking to many, many people from lots of countries who are interested in Wikipedia and who speak various languages, and on my understanding as an editor myself.
Another important thing to warn you about is a spoiler. The talk is going to be historical, chronological. But what's the end? I will tell you right away the three things that I'm going to suggest as the important strategic directions for the future. These three priorities for language diversity are as follows. Number one is making templates global and shareable across wikis. I have spoken about this topic many times, and I'm not going to dive deeply into it, but just so you know right away, I will mention it towards the end. The second thing is the wiki creation process: currently, creating a wiki in a new language is very complicated. And the third thing is a tool for managing missing content.
Now, that may not be very clear right now, but I will try to get there, and I will try to explain my rationale for all the things that I'm going to propose.
Now, this is written in a language that is very personally important for me. It's a language that I learned about when I was five years old, and learning about it really made me fall in love with languages. I've been wanting to be a linguist, or to do something like that, since I was five years old, thanks to this particular language. If you know what it is, great. If not, I will tell you later.
So, let's speak a little bit about the very early history of language diversity in Wikimedia. And let's start from a concept from sociology. Now, I'm not a real sociologist. I have a few sociologist friends, and they told me about this thing called habitus. And when I heard it, I said, oh my God, this describes Wikimedians really well. We are a community. We are described by certain habits, a certain way in which we see the world.
In case you haven't noticed it, the title of the slide is written as [[Habitus (sociology)|habitus]]: square bracket, square bracket, Habitus (sociology), pipe, habitus, square bracket, square bracket. If you are an experienced Wikipedian listening to this, you probably know that this makes a link to a page called Habitus (sociology), and that the link that will appear is just "habitus". If you are not an experienced Wikipedian, or if you are an experienced Wikipedian but you only edit in the visual editor, then maybe you don't know this, so I'm explaining it. Now, for a lot of experienced Wikipedians, this is how they see the world. They see the world basically through wikitext, through the wiki syntax in which articles are written.
Another important concept in sociology is community of practice, which is quite related to the habitus. We are used to this. We are used to seeing article histories and versions and all kinds of tools for everything: blocking, protection and so on. And I'm going to try to ask this important question.
Is this the thing by which we really want to define our community? I mean, it is an important thing, and Wikipedia doesn't exist without it. But is this the most important thing by which we want to define ourselves, this practice of writing articles in wikitext or whatever, and checking versions and so on? Maybe there is something else that we should be talking about.
Maybe we should ask ourselves: what is a Wikipedia? Because it's actually quite hard to define what it is. We usually think about this intuitively. When we see a Wikipedia in a language, we can say, okay, this is a Wikipedia. It exists. We obviously know that the Wikipedia in the English language exists. It has existed for more than 20 years. I know the Hebrew language, and I know that the Wikipedia in the Hebrew language exists. When I joined the Hebrew Wikipedia in 2005, it had about 30,000 articles. And people were saying: you know, earlier, I was not sure that the Hebrew Wikipedia actually exists. It existed physically, but it had very few articles and very few people writing it. I wasn't sure that we would ever get a real Wikipedia. But now, in 2005, when we have 30,000 articles and when we can have a meetup to which more than 50 people come, we probably have a real encyclopedia, a real functioning website. Now, English and Hebrew are lucky that they have this. Some languages are less lucky.
So I am going to try to define a Wikipedia. This is my suggestion. It's still a very hard definition, but for the purpose of this talk, I'm going to try to define a Wikipedia as something that needs writers, readers, software, current articles, future articles, and policies. Now let's take a look at each of these, and you'll see very soon how this all ties back to languages, big languages and small languages and so on.
So who are the writers? Who were the writers in the early history of Wikipedia, before 2006? A lot of these people were really close in one way or another.
Not everybody, definitely, but quite a lot of these people were close to the free software world or the free content, free licensing world. That is the world of brands like Linux and Firefox and Apache and Perl and PHP, and technologies like that. And Wikipedia kind of grew out of that world, at least in part. It was inspired by the world of free software. The concepts of free content and free culture existed before Wikipedia, but Wikipedia really made them big. So a lot of the writers were either from the free content side of things and the free licenses world, or they were software developers. Again, not all of them, but quite a lot of them. Few of them were professional design researchers. The whole field of design research was quite young back then, 20 years ago. And not a lot of them were sociolinguists who understood well what small languages need.
As for readers, a lot of early Wikipedia readers were people who by that time already knew what encyclopedias are. They used encyclopedias in their languages, and they could use the internet in their languages. So we are speaking about only the biggest languages. The first Hebrew news website appeared in 2000. Before that, there was almost no content in the Hebrew language online. So when Wikipedia appeared, Hebrew was maybe usable online; it was beginning to be usable. Russian was also quite usable. But a lot of languages were not there at all. A lot of languages of the former Soviet Union, for example: I was curious about them, and there was barely anything online 20 years ago. A lot of languages of India: also barely anything. So both the readers and the writers were quite far from the smaller and less advantaged languages.
Now, what about features? What did Wikipedia have in the early days? It was very non-interactive. These days it's relatively more interactive and dynamic.
But back then you could read an article, you could edit an article, you could have categories, which are a very important thing, and you could have talk pages. And again, I'm going back to the sociology aspect of that. That's the habitus. That's how we define it: by editing and reading articles.
Now, here is a very frequently forgotten part. Some of the larger languages, not all, definitely, but some of the larger languages actually didn't write everything from scratch. Some of the larger languages copied a lot of articles from other, older encyclopedias. It was allowed by the license. It was public domain: old editions of Britannica, old editions of some Russian encyclopedias and so on. So they didn't start from complete zero. There was something they could start from.
And there are also the future articles. I checked several languages for this. And all those languages, definitely English, definitely Hebrew, definitely Russian, definitely Polish, had some plans for the future, like lists of articles that we should write. They had different methods for doing this, but all of them had future articles. Like: we're not done now. We haven't completed the writing. We need to write more. Which articles should we write? That's a complex question. But they had plans for the future.
And they also had policies. Now, there were very few global policies, because Wikipedia started just from English, and the English Wikipedia wrote policies for itself. And then other languages wrote policies for themselves, some of which were similar to English and some of which were different, but essentially none of them were really global. So policies were usually local to each language.
And then languages started coming up, already in 2001: Catalan, German, Japanese, Russian, French and so on. And the first time I saw this, in 2004, I thought, oh my God, this is cool. And I intentionally made a screenshot with the old MonoBook skin. That's what I saw.
And as a lover of languages, I said, hmm, that's a cool website for me to join. Initially I wrote only in English, and later I also started writing in Hebrew and Russian, but seeing this list of languages really inspired me. It actually looked really cool.
And I asked myself: okay, how did Wikipedia grow to so many languages? Apparently, some languages were added early because people who speak these languages asked to add them: Catalan, German, Russian, French, Japanese and so on, the famous stories. But then there was a certain point when the maintainers of the servers just added a lot of languages. They took a list of languages and they just created the sites in them. It was well-intentioned, but a lot of these languages never really grew. People didn't come there. Unfortunately, it's sad, and we'll get to that a bit later. But that's how a lot of languages were initially created.
So the thing is that there was demand. By 2006, I was already a pretty experienced Wikipedian. And Wikipedia, as you may remember, was growing big time in that year. People started interviewing Wikipedians and writing articles in newspapers about this wonderful website. And this brings us to 2006.
This is a very cool language which I absolutely cannot read at all. This is Thai. It has a very curious feature: the whole idea of what words and sentences and punctuation are is completely different from English. It's a nice technical challenge. Think about this a bit.
So in 2006, things started to change in several important ways for language support. First, let's again speak about features. There were a lot of new features coming in, because there were a lot of new writers: people who needed all kinds of tools. Protection and revision tools and things like that were built back then, and the abuse filter. And all these things needed a lot of new user interface. And the user interface, back then, was translated using patches.
People had to write code and submit it to a source code repository. And the problem is that source code repositories are for developers, for programmers. They are not for translators. Some translators are developers, but not all of them.
So this is Jeff Goldblum, a famous actor. Back in the 90s, he made a very nice ad for the Apple Macintosh, and based on that, I like talking about translatewiki. Translatewiki is localization in three easy steps. Step one, you have an English message. Step two, you translate it into your language. And step three is that there is no step three. As a translator, with translatewiki, you don't need to do anything else. You just need to translate, and that's it. All the technical work behind the scenes is done almost automatically. And again, we come back to habitus. It was localization in a wiki, so it was really inviting for Wikipedians to contribute translations to the user interface using another wiki. And the translations were deployed very quickly after they were made on translatewiki.
And then we come to the question: okay, so how do we add new languages to the Wikimedia world? Apparently, and I didn't notice it back then, new languages were initially added on Meta, and then somebody wrote an email. And this is the email. Brion Vibber, one of the best-known developers of the Wikipedia software, wrote this very short email saying that he's going to move new languages from Meta to the Incubator. And this is it, this is the whole thing. This is how the Incubator started, one evening in 2006.
Since then, the Incubator has been working pretty much the same way as it did in 2006. In case you don't know, in the Incubator all the languages are on the same site, and every page has this funny prefix in the title which shows the language code. It's quite inconvenient. Nevertheless, that's what people use, and once a language graduates from the Incubator, it gets its own domain. Now what does it mean that it graduates?
The people who decide about graduation are called the Language Committee. I'm a member of the Language Committee. I was invited there at Wikimania in 2010. The Language Committee is a group of volunteers who examine the wiki in the Incubator: if the language is real, and it has a language code in ISO 639, and it has a certain number of articles, and it has a certain number of writers, then it graduates and gets a real domain. These numbers are actually not precisely defined, but the Language Committee tries to do something sensible. And this brings us to the next part.
This is written in a language that is spoken where Wikimania is taking place this year. So let's see what happened in the next decade.
In the next decade, the Wikimedia Foundation, and also Wikimedia Germany, started being much more serious about language support and developing it further. The Wikimedia Foundation started the Language Engineering team. I was invited there, and some other people from the Language Committee and from across the community were also invited to join the team. Initially we worked on keyboard support, font support, and the language selector, and also on improving all kinds of things in the infrastructure of MediaWiki internationalization. In 2012, the Universal Language Selector really became a big thing. It's a component for selecting a language from a long list of languages. And this also brings us to Wikidata.
Now, I was not directly involved in developing Wikidata, but it's important to mention it, because it's one of the most multilingual pieces of the Wikimedia world. It was developed by Wikimedia Germany. We collaborated with them a lot; they were one of the first users of the Universal Language Selector. The important thing about Wikidata is that it's shared across all the wikis: you put the data on Wikidata, and then it can be used everywhere. And even though it's not wikitext, so we are kind of stepping away from the habitus, it's still a wiki. It's a repository that anyone can edit.
Another thing that the language team did was deploying the Translate extension on Wikimedia sites. So it became possible to translate all kinds of pages, especially on Meta and mediawiki.org and Wikidata, so not Wikipedia articles but all kinds of other pages, using the same interface as on translatewiki. Then there was the TUX project, which made a big upgrade in how translatewiki works. In case you don't know, this is how the translation interface looks. This is both for translating pages on Meta and for translating the user interface on translatewiki.
Content Translation is a major project. It began in 2014, and it is still in development. Currently the team is busy with improving mobile support and adding more machine translation engines. But there were other talks about this at this Wikimania, so you're welcome to listen to them. And this brings us closer to the current time, to 2023.
This is written in a language about which I hadn't heard before 2020. Towards the end of 2020, I heard about it for the first time, and I started helping the people who were involved in writing in this language to get a Wikipedia. It's a good story, and it was successful. Please listen to the talk about Tyap tomorrow.
So basically, we did all these nice things till now, but what should we do next? Because evidently what we have is not perfect. So again, another disclaimer. I've been trying to make it clear till now,
but all of these are just ideas. These are not promises. I believe that these are important things that we should do in the near future, as soon as possible, but it's not like they are official projects with resources and developers and engineers and designers and all that. I do hope that they will become so, but at the moment these are proposals.
So, templates. Templates are an important part of our editing. Some things about them are great, and some things about them are not so great. The most important thing that is not so great about templates is that it's very hard to use a template from one language in another language. This is one of the most frequently asked questions. This is different from MediaWiki extensions: the visual editor is the same everywhere, and math formulas are the same everywhere, but infoboxes are different.
There are, however, a few good things about templates. It's really easy to change templates. Editors can change templates, at least on their own wiki, and the change immediately becomes usable. So that's a great thing about templates. Extensions, in contrast, are really hard to modify, but they are very easy to deploy and very easy to internationalize.
Some people say that it's easy to copy templates from one site to another, but this is wrong. I tried it many times, and it is really difficult. It's really tragic, because people often copy a template from one of the larger wikis to their own, and they just get stuck with some old version. So this is not actually easy. So my priority number one for the future is global templates: making templates global and shareable across all wikis.
Now, I say that templates implement features, so let me give you an example of some features. Musical score: you can use a musical score, and this is a screenshot from a real Wikipedia article, on any site. You can use hieroglyphics on any site. You can use math formulas on any site. But here you can see two
things: you can see an infobox and a map from OpenStreetMap. The infobox is a template, which you cannot use on all the sites, but the map you can use on every site. And it gets really complicated: railway maps, which are very common, you also cannot use on every site. And then footnotes and chess and all these things need to become global.
And then, ending very soon, because I see that people are warning me about the time: we need to improve the wiki creation process. Currently, creating wikis is extremely complicated. The Incubator needs to basically go away eventually. We need to allow people to just create wikis quickly, and to shorten the process overall. Again, I gave a very detailed talk about this at the Celtic Knot conference, so you're welcome to look that up, and definitely do watch the presentation about the Tyap Wikipedia tomorrow. And for languages in general, the whole configuration process is currently extremely complicated. It could be automated.
The last thing is future content. I mentioned in the beginning that all the big Wikipedias, in their early years, even though it was early, already had plans for the future. Now we need a system, a method to give people an easy way to plan what they will write in the future. Because it happens sometimes that wikis are created and then they never grow. Sometimes they do, but often they don't, because when they are growing initially in the Incubator, there is something that nudges them to grow and to write articles: you need to finish all those important articles, and then you'll graduate. And then what?
So we need something. I give it different names when I speak about this, but it's all basically the same thing: we need to give people a method to show some articles that are still missing in their Wikipedia. This will give a direction. This is not supposed to tell people to write all the same things as in the English Wikipedia. Absolutely not. It's just a method. It's just something that helps people see what to do next: how to grow, how to become better than
your current state, how to become better than other languages. Competition, some healthy competition, is not so bad.
AI? AI can help a bit. I know that AI is a big topic now, but this is really mostly supposed to be done by people. It's really supposed to be done by people who know their cultures and know what is needed for their cultures. And this really brings us to real equity, to a real future in which all the languages can just write what they need without thinking about difficult technical problems: just go there and write.
And this is the end. I thank you a lot. I'm very easy to reach, so please reach out to me anytime. Thanks a lot, and enjoy the rest of Wikimania.
Thank you, Amir.
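The idea from the talk of showing a community which articles are still missing in its wiki can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function name `missing_articles`, the made-up topic IDs (`Q1` and so on, standing in for some shared cross-wiki identifier), and the placeholder language code `xx`. It is not an existing tool or a proposed design.

```python
from collections import Counter

def missing_articles(coverage, target, limit=10):
    """List topics that other wikis cover but the target wiki does not.

    coverage: dict mapping a language code to the set of topic IDs
              that this language already has articles about (hypothetical data).
    target:   the language code of the wiki we want suggestions for.
    Returns up to `limit` (topic_id, number_of_other_wikis) pairs,
    with the topics covered by the most other wikis first.
    """
    have = coverage.get(target, set())
    counts = Counter()
    for lang, topics in coverage.items():
        if lang == target:
            continue
        # Count only the topics that the target wiki lacks.
        counts.update(topics - have)
    # Sort by coverage (descending), then by ID so the order is stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]

# Hypothetical example data: "xx" is a small wiki with a single article.
coverage = {
    "en": {"Q1", "Q2", "Q3", "Q4"},
    "he": {"Q1", "Q2", "Q3"},
    "xx": {"Q1"},
}

for topic, wikis in missing_articles(coverage, "xx"):
    print(topic, "is covered in", wikis, "other wikis")
```

The ranking is only a hint. In line with the talk, the people who know the language and the culture would decide which of the suggested topics are actually worth writing about, and they could just as well use a list of their own.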