Good morning, all. Today we'll talk about a very interesting thing: how to use our first language on our computers. I want to keep this session interactive, so if I ask something and you have an answer, feel free to shout. It's perfectly fine here.

First, how many people here have English as their first language, and how many have another? Please raise your hand. Again, how many people here have English as their first language? Okay, very few. I think we have around 20 to 30 people here, and only four or five have English as their first language. This session is specifically for those people whose first language is not English.

As the topic says, we'll talk about the very basics of internationalization. We have recently started the Fedora globalization initiative, and the word "globalization" in the title refers to that. I decided we need to cover two things today. One is why we need internationalization, or globalization. Then we'll see the basic building blocks of internationalization.

I have specifically selected this topic for APAC. If you look at this map of the APAC countries, you will find that the majority of them are non-English speaking; their first language is not English. Take India as an example: we say Hindi is our national language, and still we have 22 official languages in India, and each state has one major language. Then take Nepal, or China. Ideally we should talk about CJK, that is Chinese, Japanese and Korean: all of these have different scripts and languages. And if you take Indonesia, they have Indonesian as their language.
So, if you look carefully, you will find that internationalization is very important for the APAC countries. Whatever product you are going to launch, or whatever application you are going to develop, for an audience in these countries should be internationalized. That is the first thing.

Then, I often get this question from people: why isn't English sufficient? English is a global language. It is a business language, and it definitely has a lot of importance. So why can't we just use English everywhere? Does anyone have that question in mind? Anyone who thinks like that? Yeah. In my case, from SSC onwards I have been learning in English, so for almost 16 to 17 years I have been using English. Even today, if someone asks me to write something in my mother tongue, my local language, I find it difficult because I am so used to English. But that does not mean we should forget our first language, and we will see why.

This chart shows first languages with at least 50 million speakers worldwide. You will see that English comes in at number three in the world, and Hindi is just below English. Now, one can definitely argue that English must have more speakers than this. Yes, English has more speakers than this, but that is because many people have English as their second language. You should note that point.

Then, if so many people have English as their second language, why not just go ahead with English? I recently gave a presentation, "Why globalization", in which I explained clearly why the first language is so important for people. One reason is emotional. A lot of research has already been done on this topic, and the outcome was that people respond more emotionally when they hear something in their first language.
For example, if you are creating a brand and the brand name is in the first language, people are more attracted to it. People also learn things faster in their first language. That is why the first language is so important. There is one more good point, made by Sarah; it comes from a different presentation and I don't want to cover it here, but what was said there is that if you want to kill a national culture, kill its language. So if we want to preserve our culture, we should take care of our language.

The second reason we should not force English on people is this. Look at mobile phone growth. How many of you remember that Reliance launched a mobile phone for 500 rupees, around 2006 to 2007, and how strongly it affected people's lives, how it actually changed people's lives? Everyone, even people with very small earnings, started using a mobile. What do you think is another reason the masses started using that mobile phone? Any other reason? Yeah, that can be one. Any other? Yeah, cheap was one.

I can tell you the answer: mobile phones were language independent. Anyone could just pick up the phone and start talking. It was a small phone; you remember, just a keypad and a small screen, not like a smartphone. It was widely adopted because it was language independent. That is why, if we want any technology to be adopted by the masses, it should be language independent. That is very important.

Do you think we can get the same adoption for smartphones? Nowadays smartphones are also cheaper. We can get a smartphone with good specs for around 4000 rupees; that is around 30 to 50 USD. Do you think we can get the same adoption for smartphones? They have so many uses for people. They are changing people's lives, right? Everything is there.
But it is not happening at that scale, because there is a language barrier. A few companies have already started thinking about that. I would specifically like to mention Mozilla's Firefox OS. They are doing very good work with languages; whatever phones they launch are thoroughly tested for language support, and everything is supported there. There was one more initiative, from Google: they started Android One. It was also targeting the masses, and the language support was there. It's happening slowly. People are understanding the importance of language.

So that was the first part, where I wanted to show you the need for languages. In the second part, I will cover the nuts and bolts of internationalization. Any questions so far? Okay, we will take them at the end of the session. That is a good idea.

When I say nuts and bolts, think of it this way: you have a language and you want to represent that language on a computer. What are the important things for that? Any answers? The very basic, very important things? I will not wait for answers because it's a big hall. Normally I like to walk in among the audience and interact with everyone. So we'll just proceed.

When we want to do language computing, the text gets stored in the computer as binary. Now, suppose you create a document in your language, other than English, and it gets stored on your computer. If you then send that document to someone else, what is the guarantee that they will see the same output that you were viewing on your computer? Earlier, people used arbitrary encodings, like ISCII and ISFOC, and sent documents through email. And what the other people got looked like this. Have you ever seen this, the question marks?
Maybe not now, but earlier people ran into these problems a lot, and they were told to go to the View menu in Firefox and change the encoding to UTF-8. So if you don't have an encoding defined for your language, if you don't have standardized storage, you cannot start language computing. That is the first basic building block for any internationalization work.

Then, when I say encoding, where do we start? ASCII was the first encoding that was standardized. When people started language computing, they used the upper half of the 8-bit range, above ASCII (the values 128 to 255), and started mapping their languages there. An early attempt from India was ISCII. I think Japan followed the same approach in the early days. Tamil also designed one, named TSCII, something like Tamil Script Code for Information Interchange.

But later Unicode picked this up. What Unicode basically did in its initial phase was to analyze the encodings available across the world, gather them, and give each character a unique code point. When you say Unicode, think of it as a unique code for each character. They accommodated all those encodings, and Unicode has done an excellent job because it provides a lot of space: it is a 21-bit code space, organized as 17 planes of 16 bits each. It's huge. Even at present, well over half of that space is still vacant.

So, whenever we want to start with a language, storage should be there. Once you have storage, whatever you type gets saved, and you will see something like this: binary codes. Will this work for you? Can you read this? Unfortunately it's not visible from that far.
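The storage step described here can be sketched in a few lines of Python. This is only an illustration and was not part of the talk: the sample word, and the choice of Latin-1 as the "wrong" encoding on the receiving side, are assumptions.

```python
# A small sketch of "standardized storage". The sample text is an assumption
# chosen for illustration, not taken from the talk's slides.
text = "\u0928\u092e\u0938\u094d\u0924\u0947"  # नमस्ते, a Devanagari greeting

# Each character has its own unique Unicode code point.
code_points = [f"U+{ord(ch):04X}" for ch in text]

# UTF-8 turns those code points into the bytes that actually get stored.
stored = text.encode("utf-8")

# Decoding with the same encoding gives back the same text.
round_trip = stored.decode("utf-8")

# Decoding with a different encoding (Latin-1, as an example) gives garbage.
garbled = stored.decode("latin-1")

# An encoding with no slot for these characters can only substitute "?".
question_marks = text.encode("ascii", errors="replace").decode("ascii")

print(code_points)
print(len(stored), "bytes")
print(round_trip == text)
print(question_marks)
```

The round trip only works because both sides agree on the encoding; the last two variables show the two typical failures, unreadable characters and question marks.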
What I want to point out is that now we have storage; the text is stored in a standard encoding. But can you view it? Can you read it? Can you share it with your people? Definitely not. Maybe we can find some species that can read these codes and communicate in them, as some secret code maybe, but it's not for the normal user. To read this effectively we need something more visible, and that's where fonts come into the picture.

A font is like this: we have a glyph shape, and that shape is mapped to a hex code. Whenever a rendering engine encounters that hex code, it shows you the image from the font. So this is the raw code of a text article I took from a newspaper, and when you apply a proper font to it, you get this kind of output.

So you have seen two of the basic building blocks. One was storage, which should be Unicode; it is widely adopted and the default standard. Next come the fonts to display the language. But look carefully here. Do you think it is rendering properly? It should be like this, right? Yeah. Exactly. What is happening right now is that there are hex codes in the background, and when you apply the font, the glyph shapes are simply selected and displayed one by one, the same way as with a Latin font. In Latin, "A" is at 65, so it shows you "A" directly; it is doing just that.

To get exact rendering we need some extra processing, because, as you know, complex scripts require a lot of reordering. You can see it here with the short "i" matra: we type it after the consonant, and then it gets reordered in front of it. On top of that, sometimes we type three or four keys and a conjunct is formed. For that kind of thing we need an OpenType layout shaper.
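The reordering and conjuncts described above can be made concrete by looking at what is actually stored. The following is a minimal sketch, assuming two sample strings chosen for illustration (they are not from the slides); it uses Python's standard `unicodedata` module to print the stored order of the characters.

```python
import unicodedata

# Sample strings are assumptions chosen for illustration.
ki = "\u0915\u093f"          # कि : KA + VOWEL SIGN I; the sign is drawn to the LEFT of KA
ksha = "\u0915\u094d\u0937"  # क्ष : KA + VIRAMA (halant) + SSA; drawn as one conjunct


def describe(text):
    """Return the Unicode character names in stored (typing) order."""
    return [unicodedata.name(ch) for ch in text]


print(describe(ki))
print(describe(ksha))
```

In both cases the stored sequence follows typing order. Moving the vowel sign to the left, or merging three code points into one conjunct glyph, happens only at display time, which is the job of the shaper together with the rules in the font.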
So here comes the OpenType layout shaper. What does an OpenType layout shaper do? Basically, you write rules in your font. If you look here, there is a consonant, then a halant, then another consonant, and when a proper OpenType layout shaper is present, you can see the first one properly becomes a half form. This happens for most of the combinations here. So an OpenType layout shaper is a must. And you may still get by with the Devanagari script, but when you look at the Nastaliq script, it is more complex; it flows almost vertically. Maybe I will add a screenshot when I upload the slides. For these kinds of things an OpenType layout shaper is a must.

Yesterday, in the UTR session, I think one student told me that he is working on implementing some Java applications for Indic languages and is getting output like this: just isolated characters. The problem behind that is that the OpenType layout shaper is missing. I would not say it is missing in Java; somehow he missed calling the proper API. So the OpenType layout shaper is the next building block.

Once you have the shaper, you have storage, you have a font, and you have a shaper. Now you can view documents in your complex script. If you are browsing a website, or someone sends you a document, you can view it, read it, print it; everything will work fine. But can you create content in your own language? Definitely not; with only this software you will not be able to create anything. If you try to use your QWERTY keyboard, it gives you only ASCII. For creating content in your language you need the next thing, the keyboard layout.

The English keyboard layout covers only 26 letters, and as you know, QWERTY is designed specifically for English. In the Devanagari script alone we have 128 characters.
So can we accommodate that many characters on this layout? Definitely not. Can we go for a new keyboard? By the way, there is a patent specifically for Indic language keyboards. If you search for it, you will find that the person developed a big physical hardware keyboard and mapped every character onto it. It's a huge keyboard, and it is patented. But we need to work with what we have. And in Devanagari we have only 128 characters. When you go to Chinese, Japanese and Korean, there are thousands of characters; there is something like a symbol for each activity. So it is more than thousands. It's huge. How can you accommodate that many on your QWERTY keyboard?

That is where the input method framework comes into the picture. An input method framework acts as a driver between your hardware keyboard and your application. You can develop a specific input method for these languages, and then it will start working. We are just looking at the basic building blocks right now; then I will explain what is available in Fedora. So we definitely need specific keyboard layouts to handle the requirements of complex scripts.

So now we have the keyboard as well. Then one more thing remains. Each country has different date formats. Even the days: in English we say Monday to Sunday, but in Marathi, say, it is Somvar to Ravivar. The currency symbol also differs across countries. So if you are developing an application and you mark a field as currency, it should pick up the proper currency symbol. If you do not have the proper locale selected, you will get a dollar sign there, because the application picks up the currency symbol from whatever locale you are in.
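The idea that an application takes its currency symbol and day names from the selected locale can be sketched with a toy lookup table. Everything here is hypothetical: `LOCALE_DATA`, `format_price` and the fallback to `en_US` are made up for illustration, and real systems read this data from glibc locales or CLDR rather than from a dictionary in the program.

```python
# Toy locale table. The entries and the fallback rule are assumptions for
# illustration only; this is not how glibc or CLDR store their data.
LOCALE_DATA = {
    "en_US": {"currency": "$", "monday": "Monday"},
    "en_IN": {"currency": "\u20b9", "monday": "Monday"},  # U+20B9 INDIAN RUPEE SIGN
    "mr_IN": {
        "currency": "\u20b9",
        "monday": "\u0938\u094b\u092e\u0935\u093e\u0930",  # सोमवार (Somvar)
    },
}


def format_price(amount, locale_name):
    """Format an amount with the currency symbol of the given locale.

    Unknown locales fall back to en_US, which is why an application run
    without a proper locale ends up showing a dollar sign.
    """
    data = LOCALE_DATA.get(locale_name, LOCALE_DATA["en_US"])
    return f"{data['currency']}{amount:.2f}"


print(format_price(499, "en_IN"))
print(format_price(499, "xx_XX"))  # no such locale, so the fallback applies
print(LOCALE_DATA["mr_IN"]["monday"])
```

The application code never mentions a specific symbol; it only names the field, and the locale supplies the value.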
Otherwise you need to hard-code it manually. If you remember, I think two to three years back, if I'm correct, we got our INR currency symbol; that is U+20B9.

One point I would like to mention: when you install any software or operating system, you must select the proper locale during installation. Installing with en_US in India will not help you much. Here is one specific example: if you are installing Fedora and you choose the en_IN locale, it will automatically select a keyboard layout that has the rupee symbol at its standardized position. But if you select en_US during installation, you will not get the currency symbol, and you will have to go into settings and select the proper keyboard layout. So when you install anything, select the proper country locale, and most things get set up automatically.

I think yesterday we were discussing that we do not get the language packages, such as the spell checker and the input method. If you select the proper locale during installation, say Marathi (India), you will get most of those things ready and set up when you first log in. So I request you all: don't just choose the en_US locale when installing software. If you are in India or any other country, select the respective locale. Locales are very important.

Each country also has different sorting rules. In India we definitely have lots of languages, but even within Latin script there are lots of differences. Some people sort capitals first, some sort lowercase first, and many Latin-script languages have extra characters as well.
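The "capitals first" versus "lowercase first" difference can be shown with plain Python sort keys. This is a simplified stand-in under stated assumptions: the word list is made up, and the two key functions only imitate one tailoring choice; they are not the Unicode Collation Algorithm or any real locale's rules.

```python
# Illustrative word list (an assumption, not data from the talk).
words = ["banana", "Apple", "apple", "Banana"]

# Raw code point order: all capitals sort before all lowercase letters.
by_code_point = sorted(words)

# Compare letters case-insensitively first, then break ties capitals first.
upper_first = sorted(words, key=lambda w: (w.casefold(), w))

# Same primary comparison, but ties are broken lowercase first.
lower_first = sorted(words, key=lambda w: (w.casefold(), w.swapcase()))

print(by_code_point)
print(upper_first)
print(lower_first)
```

The same four words come out in three different orders, which is why the sorting rule has to come from the locale instead of being fixed in the application.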
So, because of sorting rules and the rest, the locale is very important. And if you don't know, Unicode is doing a great job for locales: there is the Unicode Common Locale Data Repository (CLDR), and it has locale data for almost all of the world's locales.

So we have covered the basic building blocks. First is storage; without storage we can't go ahead at all. Then we need a font. Here I am just naming these things, storage, fonts, rendering engine, but each has a lot in it if you dive in depth. In fact fonts are a complete domain in which you could do your graduation or a PhD; you could do a PhD in rendering as well. It's vast. Here I am covering only the basics, and as students you can definitely dive deep into each of these concepts. So: storage, fonts, rendering engine, then the keyboard, then the locale. Those are the most important blocks for starting internationalization for any language.

Then, what do we have in Fedora for these things, and how can one contribute? We already support Unicode 7.0: in the latest version, Fedora 22, we added support for Unicode 7.0, and whatever characters are in Unicode 7.0 you can use in your applications. For Fedora 23 we are trying for Unicode 8.0 support. Unicode 8.0 adds more than 7000 characters, and we want it not only in glibc but in libicu as well. So there are many components in Fedora that need updates for Unicode 8.0.
So if anybody is interested, please feel free to come forward and start working on this task. It's simple, but it's good to start with simple things. We are planning that change proposal for Fedora 23.

Next, in Fedora we already have fonts for the Unicode Basic Multilingual Plane. The Basic Multilingual Plane, the BMP, might be jargon for most of you, but it is the plane that is actively used in almost all operating systems. It has room for all the actively used languages in the world. It is a huge plane, but it is now almost full. In Fedora we have fonts for this whole range, so it will happen very rarely that you browse content in some language and it does not display properly. That is a very good achievement for an operating system: we should have a default font for almost everything in common use.

Then the HarfBuzz-NG layout shaper. As I mentioned earlier, without an OpenType layout shaper we cannot get proper rendering. We started discussing HarfBuzz-NG around 2008 to 2009, and a couple of years back it was completed by Behdad; it is now supported in all the applications. In LibreOffice, in GNOME, HarfBuzz-NG is already there, so you get correct rendering for complex scripts.

Then comes the input method, that is, the keyboard. We have m17n; the 17 stands for the 17 letters between the "m" and the "n" in "multilingualization". m17n supports all the Indic layouts. Then we have specific input engines for CJK, that is Chinese, Japanese and Korean; we have lots of input methods for those. In Fedora 23 we are trying to switch to libzhuyin as the default for Chinese. On top of that, we have a predictive text input method that supports almost all languages, those for which Hunspell dictionaries are available. I am thinking of giving a BarCamp session on the predictive text input method.
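The core idea of dictionary-based prediction can be sketched as a prefix lookup in a sorted word list. This is a toy under stated assumptions: the `complete` function and the word list are hypothetical, and this is not how the predictive input method in Fedora is implemented; it only shows why a dictionary for a language is enough to offer completions for it.

```python
from bisect import bisect_left


def complete(prefix, words, limit=3):
    """Return up to `limit` words from the sorted list `words` that start with `prefix`."""
    # In a sorted list, all words sharing a prefix sit next to each other,
    # so a binary search finds the first candidate directly.
    start = bisect_left(words, prefix)
    matches = []
    for word in words[start:]:
        if not word.startswith(prefix):
            break
        matches.append(word)
        if len(matches) == limit:
            break
    return matches


# Hypothetical dictionary; a real one would come from a spell-checking word list.
WORDS = sorted(["font", "fonts", "fedora", "format", "keyboard", "locale"])

print(complete("fo", WORDS))
print(complete("fo", WORDS, limit=2))
print(complete("x", WORDS))
```

Nothing in the lookup is specific to English; the same function works on a word list in any script, because it only compares sequences of code points.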
You are probably very used to seeing predictions on your mobile phone. But do you think having predictions on your laptop would be useful as well? Yeah. So we already have a predictive text input method in Fedora, and maybe we will have a BarCamp session on it. Its name is ibus-typing-booster.

Then, for locales, we have 327-plus locales in glibc.

I think I am short on time; I got the indication earlier that only five minutes were left. So I am open for questions now. Do you have any questions? That's good; no questions is good. Anyway, we do all our work in the open. We have the #fedora-globalization IRC channel, where you can ask your questions. We have a weekly Fedora globalization meeting; you can definitely participate and see what is happening and where you can contribute. There are lots of open tasks in Fedora globalization at present. We are planning some development sprints, and maybe you can take part in those. We also have the G11N mailing list, where you can post.

The best thing in Fedora is that everything happens in the open. If you are interested in anything, you can simply search within Fedora, and you can understand, learn, contribute and grow as a developer. That is the best part of Fedora. With each session you will get pointers like these, and it is your turn now to jump in and start developing. With this, I would like to say thank you.