Good morning everyone. Welcome to my talk on internationalization. My lovely assistant Coconut, who is also an internationalization expert, may or may not make an appearance. We'll see how she feels. But let's get talking about internationalization. So quick note before I start. The topic of internationalization is super huge. There's so much I could talk about and I only have 30 minutes. So I'm gonna go kind of quick through a lot of different topics. I recommend, if you want to follow along, to grab the slides. You can find them on my Twitter or my personal website. My Twitter handle is just my first and last name, which is Robin Daikiman. I don't have a lot of time to go into great depth, so I encourage you to start researching things on your own after this talk. So as I mentioned, my name is Robin. I am a UI engineer at a company called Talia in Austin, Texas. And my journey to want to learn more about internationalization comes from the fact that prior to moving to Austin, I actually lived in Asia for six years. So I lived in Taipei, in Suzhou, and in Shanghai. Prior to that, you know, I had kind of taken for granted how most of the web was built for people like me, Americans, English speakers. And so when I moved to China and Taiwan, I found that oftentimes I would go to order food or purchase something online, and I would get really frustrated or really confused, or sometimes the translations would be silly or inappropriate. And it kind of got me thinking about how we should internationalize our applications in the US for non-English speakers and for people who aren't Americans. And as I was learning more and more about this, there were just so many things that I didn't know that I didn't know. And so I wanted to share those things with you so that your users don't feel this frustration and this confusion when they're using your application.
So what I wanted to cover is the following. So first of all, what is internationalization? A lot of people, myself included when I first started learning about this, confuse internationalization with localization. Or some people think, well, my application is only ever going to be in English, or maybe it's only going to be used by Americans, so I don't need to worry about that. Not so. There are still some things you have to think about, even if your app is only in English or even if your app is only going to be used in the United States. So we'll talk about how internationalization is different from localization and why internationalization is important. Then we're going to kind of dive into five different categories of internationalization issues that might arise. So I've kind of divided it into the following categories. First of all, translations; dates, times and numbers; non-Latin characters and validation; designing with only English speakers in mind; and then RTL support, which stands for right-to-left language support. So let's just start off with what is internationalization? So there are three terms that are going to come up a lot when researching the topic of internationalization, which are internationalization, localization and globalization. And just a quick note, if you ever see them shortened like i18n, those are called numeronyms, in the same way that we often shorten accessibility to a11y. It's the same thing. So it's just a shortened version of that. So these three terms, let's talk about what they mean and how they're different. So here's a chart that I have with internationalization and localization. So we're going to focus on internationalization, and as developers, most of what we're doing is going to be on the internationalization side of things, not the localization side of things. And what I mean by that is internationalization is the idea of not hard coding something to a specific language or a specific region.
And the easiest example I can think of is when we're talking about dates. Now, in the US, we format our dates as month, day, year. Now, if you're in another country, you might expect that date to be in day, month, year, instead of month, day, year. And so that could be really confusing. So if we're building our application, we don't want to hard code that date to always be in the American format. We want to make sure that we are making a function that will spit out the date depending on the location or depending on the user's preference. So internationalization is the process of taking care of all those aspects, and it's far more than just dates or just adding a language tag at the top of your HTML. And we'll go over what that means. But internationalization is the process of making sure that nothing is hard coded to a specific language or region. Now, hopefully we do this when we first start developing our application. As we build new components, we make sure we're building them with internationalization in mind. And therefore, we only have to do that once. Now, we might start with just English. And then eventually, our product manager or someone comes up to us and says, we want to also add Spanish. And so we localize into Spanish. Or we localize into Russian, or Spanish from Latin America versus Spanish from Spain. So we can localize multiple times. And ideally, we've done such a good job at internationalizing that that localization process goes very smoothly. Now, unless you're fluent in that language, you probably won't have too much to do with the actual localization part of things. So our job as developers is the internationalization part of things. This entire process, internationalization and localization together, is called globalization. If that still doesn't make sense, here's one more visual. Internationalization is everything under the surface. And localization is what you see.
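To make the date idea concrete, here is a minimal sketch of "a function that will spit out the date depending on the location". It assumes JavaScript's built-in Intl.DateTimeFormat is available; the helper name formatDate is hypothetical and is not something from the slides.

```javascript
// Hypothetical helper (name is an assumption): format one stored date
// for whichever locale the user prefers, instead of hard coding M/D/Y.
function formatDate(date, locale) {
  // Pin the time zone to UTC so the result does not depend on the
  // time zone of the machine running the code.
  return new Intl.DateTimeFormat(locale, { timeZone: "UTC" }).format(date);
}

// The same instant: 2 January 2020, stored once.
const january2 = new Date(Date.UTC(2020, 0, 2));

console.log(formatDate(january2, "en-US")); // 1/2/2020
console.log(formatDate(january2, "en-GB")); // 02/01/2020
```

The stored value never changes; only the presentation does, which is the split between what is under the surface and what the user sees.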
So again, going back to that example with the dates, we're going to actually see the final date. If we are American, and we see that date formatted as month, day, year, we're probably never going to see it formatted in another way. We're not going to see everything that goes on under the surface to make sure that date is formatted in the correct way. So that is what internationalization is and how it's different from localization. Again, we'll be talking about internationalization in this presentation. Let's talk about the why. So I mentioned earlier that I was pretty lucky that because I'm American and because I'm a native English speaker, most of the web was built for people like me. And here are some stats to back that up. So on the left, we have languages found on the web, and it probably surprises nobody that most of it is in English, over 50%. Now, if you look on the right, it says internet users by language. So on the right, English is still at the top, but 50% of the web is in English and yet only 25% of internet users speak English. So there's this discrepancy between how much is available in English and how many people actually speak English. And you can even look at Chinese. So you can see Chinese on the right graph. It's the second largest group of internet users. And yet it's not even in the top 10 of languages found on the web. So from a financial standpoint, it makes a lot of sense to be able to translate into these languages, because these non-English users might not find similar applications in their own language. Now, you might think to yourself, well, my application's only in the US, so I don't really need to worry about this. And actually, in the US, 21% of people speak a language other than English, and 8.5% speak English less than very well. So you're missing out on a huge user base if you're only making your app available in English.
And furthermore, I've been talking about this from a financial standpoint, but I think it's really important from a moral standpoint as well. So we often talk about accessibility in terms of people with visual impairments, or people who can't use a mouse, and therefore we're making our websites keyboard accessible. Internationalization is also part of that umbrella accessibility term. I personally believe that the internet is a human right, and that everyone deserves to have access to the web. And it's not really fair that just because someone doesn't speak English or just because someone isn't an American, they don't have that access. And so we as developers have a moral duty to make sure that we are making our websites accessible, and that includes internationalization under that accessibility umbrella term. So let's go into those five categories that I was talking about, common internationalization pitfalls and how to avoid them. So how I will structure this is the following. I will give you issues that might come up while you're developing your application, and then I will tell you how to solve them at the end. So it will be problem, solution, then on to the next category, problem, solution. So the first problem is translations. So I told you at the beginning that the very definition of internationalizing your application is to not hard code something to a language or region. So obviously the first thing that comes to mind when talking about translations is not hard coding text. So when you're building a personal project or maybe something super small, you often will just type your English words right into your HTML, which is fine with small applications, but we just want to keep in mind that we shouldn't be doing that for larger scale applications. And what I mean by hard coding text is just something as simple as this, just having the English right in the HTML. This is going to be really difficult if in the future we want to go and translate that into Spanish.
We're going to have to rip out all the English. It's going to be a huge hassle. So what I mean by not hard coding the text is the following. So we might have a function called t, and I'll talk later on about internationalization frameworks that will give you this functionality. But just to give you an example of what I mean by text that isn't hard coded: having a function that goes to the place where the text is stored and pulls it out. Even if it's all English text at the moment, at least you have it stored separately, so in the future you would be able to translate into different languages. The next issue that comes up a lot is getting the correct translations. And again, as someone who lived in China, I saw this all the time. Google Translate is not going to cut it for a professional application, because a machine cannot understand context. And it's often going to give you confusing or incorrect or inappropriate text. So here are some examples. I'm sure you all have seen things similar to this. Seeing a sign that says "be aware of safety", which by the way, this is everywhere. If you go hiking, these signs are everywhere in China. Something like "child shredded meat". "Eat your fingers off." Or "use the queen to invite powerful water". Now I'm sure all those people who made the signs did not actually mean that exact sentence. So we want to avoid relying on machine translations. The next issue that is going to come up a lot is plurals. Now in English, the rules of pluralization are the following. If you have one of something, you don't have an S. If you have zero or two or more of something, you have an S. And I chose cat just because I wanted an excuse to put my kitties into the slides. It's Pepper and Coconut. So we've got two forms of plurals in English. Now in Chinese, it's the same regardless of how many you have. If you have zero, it's māo; if it's one, it's māo; if it's two, it's māo. It's all the same. Now, other languages have even more than two plurals. Polish, for example, has three.
And I'm not going to try to pronounce this. And confession, I did the exact thing I told you not to do last slide and I pulled this from Google Translate. So if any Polish people are watching this and that's wrong, I apologize. But the point is that you don't know the pluralization rules of every single language that you ever might translate into. So this can be something that you really need to be cognizant of. Also, more cat photos: some languages have gendered nouns. So in English, we don't have this, but for example, in Spanish, there is a masculine form and a feminine form. And so for cat, if it's a boy cat, you would have an O at the end. And if it's a girl cat, you would have an A at the end, and all the nouns have either a masculine or a feminine form. And some languages, such as German, actually also have a neuter form. So these are again things that we don't have in English, but you need to think about them if you're going to translate into another language. So I've given you a bunch of problems. What is the solution? So first of all, obviously, don't hard code your text. So if there's even a remote possibility that you might want to add another language, it'll be so much easier if you just store your translations elsewhere and don't hard code anything. Otherwise it's gonna be really hard to find all of that text later. So just keep that in mind when you're starting an application. Hire a proper translation service. So don't rely on Google Translate or other AI to provide your translations. It'll be fine if you have short and common phrases, say submit or enter; you can actually look for a verified checkmark to see if it's a good translation. But for anything that's a sentence, you just need a translation service. And you might be wondering, okay, if I'm not hard coding my text, how am I going to pull out this text?
And the solution most people use is an i18n, or internationalization, framework that will help you to pull in the correct text based on the language that the user has selected. And it can also help you with those things like pluralization and gendered nouns. So here are some internationalization frameworks or libraries. And I tried to keep this not framework specific. So these are just JavaScript ones, not React specific ones or Angular specific ones. If you have a smaller application, you could use something like Polyglot. If you have a larger application and need something that does heavier lifting, you could use i18next. That's actually what I use at my company. We use react-i18next. And here are some other libraries if those two aren't doing it for you. For react-i18next, this is how we set up plurals. So here's just an example. So at my company, we support 18 different locales. So I've listed them here in the chart. Most of them have two plural versions, but Japanese, Korean and Chinese just have one version. Czech and Polish have three and Slovenian has four. And so the internationalization library just expects the translations to look something like this. So in my translation file, say I'm just storing it in a JSON file, I would just store it as cat and cat_plural. And then I pass in the count. And then the framework does the lifting for me. The framework figures out how many cats. Speaking of, there's Pepper again. Finally, just a note: don't make assumptions about grammar. Grammatical structure is not the same in all languages, especially the further away from English you get. So Spanish and French and Italian all have similar roots. But if you're working with something like Korean or Arabic, there's probably not going to be a ton of similarities between them. So don't concatenate your strings. Again, it might work in English and Spanish, but that doesn't mean it's going to work in all languages. So just be aware of that.
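To tie the translation points together, here is a rough sketch of what a lookup function with plural support might do under the hood. It is not how i18next or Polyglot is actually implemented: the resources object, its key layout, and the t function below are all hypothetical, and the translated strings are illustrative only. It assumes the built-in Intl.PluralRules API to pick the plural category for a locale.

```javascript
// Hypothetical translation store: one entry per locale, keyed by the
// plural category that Intl.PluralRules reports for that language.
const resources = {
  en: { cat: { one: "{{count}} cat", other: "{{count}} cats" } },
  zh: { cat: { other: "{{count}} 只猫" } },
  pl: {
    cat: {
      one: "{{count}} kot",
      few: "{{count}} koty",
      many: "{{count}} kotów",
      other: "{{count}} kota",
    },
  },
};

// Hypothetical lookup: find the right plural form, then fill in the
// placeholder rather than concatenating strings by hand.
function t(locale, key, { count }) {
  const forms = resources[locale][key];
  const category = new Intl.PluralRules(locale).select(count);
  const template = forms[category] ?? forms.other;
  return template.replace("{{count}}", String(count));
}

console.log(t("en", "cat", { count: 1 })); // 1 cat
console.log(t("en", "cat", { count: 0 })); // 0 cats
console.log(t("pl", "cat", { count: 2 })); // 2 koty
console.log(t("pl", "cat", { count: 5 })); // 5 kotów
```

Because each translation is a whole phrase with a placeholder, a translator can move the number wherever the target language's grammar needs it, which is the reason to avoid concatenation.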
Okay, the next problem. So after translation, the thing that comes up a lot, I think, is dates and times, calendars, time zones, numbers, all that. And now we're kind of getting to the point of, yeah, okay, the translation thing, if you're only going to use English, you probably don't have to worry about your translations. But this is really where we're going to start talking about issues that come up even for people who have English-only apps. So first of all, I mentioned this earlier. If I showed this date to a group of Americans, 01/02/2020, most Americans would probably say that this is January 2. If I showed this to a group of British people, they would probably say this is February 1. So here's your first huge issue. This date can be confusing, even though these people speak the same language. Another thing: in the US, we generally prefer 12-hour clocks, so 3pm or 3am. In a country like Germany, they might prefer a 24-hour style clock. Time zones are super frustrating. And if you've ever worked with time zones, I'm sure you feel my pain here. So time zones are super tricky, not only just doing the math between time zones, but also some time zones start at the 30-minute mark. And some time zones observe daylight saving time. Some don't. The ones that do don't all change their clocks on the same date. And sometimes within one country, such as the US, not all of the country observes daylight saving time. So it can be super confusing, something that can cause a lot of headaches. Another thing to think about is calendars. So in the US, we generally prefer that our calendars start on a Sunday. In the UK, they generally prefer that their calendars start on a Monday. Numbers are another thing. So if I gave this number to a group of Americans and said, hey, can you format this with commas and decimal points? They would probably format it like so: 123,456.78. Now say I gave it to a group of people from India.
They might format it as 1,23,456.78. And then in Spain, they swap the decimal point and the comma from how we're used to it in the US. So 123.456,78. So this can be super confusing again, if you are expecting the number in one format and you receive it in another. Another thing is currency. We want to make sure that we're showing currency that makes sense to the user. And also keep in mind that the word dollars could be US dollars, could be Canadian dollars, could be Australian dollars; lots of people use the word dollars. So just make sure to be clear on your currency and make sure that it makes sense for your user. So what is the solution to all this? Well, first of all, don't try to solve this yourself. This is way too hard for one person or one team to work on. Do not reinvent the wheel. There are so many open source libraries that you can choose from. Moment.js is probably the most well-known date-time library. However, it does get a lot of flak because of its bundle size. So there are alternatives. And actually, as I was researching for this presentation, I found a GitHub repo called You Don't Need Moment.js. And it compares all the different libraries that you can use. So keep in mind, some of them don't support time zones. So if you need that functionality, you probably do need a bigger library. But yeah, there are alternatives if you don't want to use Moment.js. Another thing you could use, and this is something that was new to me as I was researching for this, is the Intl object. So this is something you don't need to download a library for. It is a native JavaScript object, so you can open up your dev tools and use it right now. And it gives you a lot of things: it gives you relative time format, list format, number format, plurals. And when I say plurals, I don't mean like what we talked about in the translation category. I mean words like few or many, and it gives you those in different languages.
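As a small sketch of the number and currency points above, assuming the built-in Intl.NumberFormat API: the helper names formatNumber and formatCurrency are made up for illustration, and the number is stored unformatted and only formatted at display time.

```javascript
// Hypothetical helpers (names are assumptions): keep the raw number,
// and let the locale decide grouping and decimal separators.
function formatNumber(value, locale) {
  return new Intl.NumberFormat(locale).format(value);
}

function formatCurrency(value, locale, currency) {
  return new Intl.NumberFormat(locale, { style: "currency", currency }).format(value);
}

const amount = 123456.78;

console.log(formatNumber(amount, "en-US")); // 123,456.78
console.log(formatNumber(amount, "en-IN")); // 1,23,456.78
console.log(formatNumber(amount, "es-ES")); // 123.456,78

// "Dollars" is ambiguous, so the ISO currency code is passed explicitly.
console.log(formatCurrency(1234.5, "en-US", "USD")); // $1,234.50
console.log(formatCurrency(1234.5, "en-US", "CAD")); // CA$1,234.50
console.log(formatCurrency(1234.5, "en-US", "AUD")); // A$1,234.50
```

Passing the currency code separately from the locale keeps "which currency is this" and "how does this user read numbers" as two independent decisions.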
Here's an example. So if I wanted the relative time format for es-MX, which means Spanish from Mexico, and I wanted minus one month, so one month ago, it would give me back "hace un mes". If I wanted to format a number, and I tell it ja for Japanese, I want it in the style currency, and I give it the currency that I want it formatted in, and then I pass in the integer, it will format that number for me. So it has a lot of great options for something that you don't need to download. Finally, just some last tips. Stay consistent, keep your communication open between the front end and the back end, and make sure that you're sending everything in the same format. If you can, store your dates in UTC, which is Coordinated Universal Time. It's not a time zone, it's just a standard that we've all agreed upon as zero. And store your numbers as integers. There's no reason to store them formatted; just let whatever library you're using format them for you. And if all else fails, call in Ryan Gosling and have him give you pick-up lines like, "Hey girl, did you lose a timestamp? Because I'm pretty sure it's datetime.now." I know those are super old. I don't care. So third problem, non-Latin characters and validation. I've kind of pushed these together because they're somewhat related. So first of all, if you've ever seen characters like this, maybe on the internet in the 90s, or you've received an email with funky characters, it usually means that they haven't encoded their characters properly, and they have some non-Latin characters in there. This is an example I created; it's a fake example. But we don't ever want to have a form and have someone put in their name, Zoë, with the two dots over the E, and tell her that her name is invalid. And you might think to yourself, yeah, that's silly, I would never do that. No respectable company would ever do that. But actually, just this month, the US government did it. So yeah, that was my silly example.
This is a real life example where somebody tried to put in their last name, which had an accent over the E. And the US government said, hey, your name is invalid. Please don't do this. It can be very harmful and really offensive to people to tell them that their name is invalid. So again, even if your application is only used in the US, and it's only in English, you still need to think about internationalization issues such as this. This should not happen, especially on the US government's website. So what is the solution to this? Obviously, encode your characters. There are a few ways you can do this. You can set it in your HTML, set the charset to UTF-8. Use Unicode. So Unicode's great because not only does it have the alphabet A through Z, accent marks, umlauts, everything. It's got Chinese characters, the Cyrillic alphabet, Arabic characters, emoji. So your emojis are properly encoded. It's important. Be aware when you're sorting with non-Latin characters; if you're writing a sorting algorithm by hand, just be aware of that. Be careful with your routing. So if you have dynamic routing, and you're supporting Chinese, and you pull in a route with a Chinese character and then in the URL you have a Chinese character, actually that will work. But that's not the standard. The standard is to use just the A through Z alphabet. Be careful when you're choosing your fonts. Just because a font looks good in English and Spanish and French and Italian does not mean that it's going to look good in Russian, or Chinese, or Japanese. And then talking about validating, be aware of what your regex is checking for. It could be a silly example, or it could be something actually offensive, as in the example I showed you before. So we don't want to tell Zoë that her name is invalid. Keep in mind that in some languages, so this is Chinese, a single character can carry meaning. So don't tell someone that their last name is too short.
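As a rough sketch of the validation and sorting advice above: the pattern and the isPlausibleName helper below are hypothetical and deliberately permissive, not a recommended definitive rule for names. They assume a JavaScript engine that supports Unicode property escapes in regular expressions and the built-in Intl.Collator.

```javascript
// Hypothetical check (an assumption, not a standard): accept any Unicode
// letters and combining marks, plus apostrophes, hyphens and spaces,
// instead of only A through Z. There is no minimum length beyond one.
const namePattern = /^[\p{L}\p{M}][\p{L}\p{M}'’\- ]*$/u;

function isPlausibleName(name) {
  return namePattern.test(name.normalize("NFC").trim());
}

console.log(isPlausibleName("Zoë")); // true
console.log(isPlausibleName("李")); // true, one character is a valid name
console.log(isPlausibleName("1234")); // false

// Sorting: the default sort compares UTF-16 code units, so accented
// letters land after "z". A locale-aware collator sorts them in place.
const names = ["zebra", "Émile", "apple"];
console.log([...names].sort()); // [ 'apple', 'zebra', 'Émile' ]
console.log([...names].sort(new Intl.Collator("fr").compare)); // [ 'apple', 'Émile', 'zebra' ]
```

Even a permissive pattern like this can reject a real name, so a gentler design is to warn rather than block when a name looks unusual.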
And another note on last names. People can have more than one last name. So make sure to check for that. Maybe you have two inputs, or maybe you just make sure that your input doesn't tell someone that it's invalid because they have two words in there. You don't want to do that. The fourth problem: designing with only English speakers in mind. So this is more of the UX perspective, but still good to be aware of as developers. Now, this is a really simple example, but I promise you it happens all the time. So say I make a button, and it says Save, and it's this big. And I hard code it because I know that this is how big Save is. And then two years later, we decide to translate into Spanish. And suddenly, it does not fit in the button. Again, this happens all the time. Here's an example of the width of an English word compared to other languages. So you see that English is actually one of the narrower ones in terms of the width of the word. Korean, Chinese, those languages are often around the same size or sometimes shorter. But when you go into languages like Portuguese, French, German, they're oftentimes much longer. And for some reason, it's always German that has really long words. I actually asked some German friends to give me some really long word examples. So this is what they gave me. Yeah, so here's the English word and the German word. So you need to make sure that you are giving enough space for these really long German words. So what is the solution? Allow for enough flexibility and space that translated text fits into your design. So the rule of thumb is that if the English is fewer than 10 characters, leave room for it to expand to three times that length. And if it's greater than 10 characters, leave room for 30% expansion. And then for us as developers who are coding that CSS, make sure to not use fixed widths. Try to keep your CSS flexible, so that things grow and shrink accordingly.
Alright, our last thing that might come up is supporting right-to-left languages. Now here is a list of some right-to-left languages. So as English speakers, we're very familiar with left-to-right languages. And a note here, some languages like Chinese traditionally go top to bottom, but on the web they go left to right. So you're really only worrying about left to right and right to left. If you have never seen a right-to-left website, here is an example. So here is a left-to-right language, English. And you see the Wikipedia logo is on the left, the pictures of the cats are on the right, the input's on the right. When I look at the Arabic version, everything is swapped. So we've got the logo on the right, and now we have the input on the left and the photos on the left. So you can see it just swaps back and forth, which is pretty cool if you've never seen that before. And how do you achieve this? Well, first of all, you need to add dir="rtl" to your HTML. If you've never seen that before, it's because the default is LTR, left to right. But we need to specify if it's going to be right to left. What you're going to have issues with is if you're using things like float left, float right, text-align left, text-align right, because float right is going to be on the right in both left to right and right to left. It's going to be really confusing. What you can use instead is something like Flexbox. With Flexbox, if it's flex-start, it will be on the left in a left-to-right language. And if it's a right-to-left language, it will be on the right. So it observes the directionality, unlike floats and text-aligns. And even if you're not going to support right-to-left languages, something that I think is a good practice is to name things based on what they do and not what they look like, because names based on looks don't always make sense once the directionality changes. So here's an example. Let's say I've got two pagination arrows.
If I was just thinking left to right, I might call them left arrow and right arrow. But that doesn't make sense in a right-to-left language. What does make sense in both is previous and next. So if I call them that, it works in both, and previous and next is what they actually do, not what they look like. So try to use words like before, start, beginning, backwards, above, etc. So in conclusion, these are the final takeaways. Design your app to be language, region and culture independent. Use open source projects to help you solve problems that you didn't even know existed. And just keep in mind that globalizing means bigger markets, more inclusiveness and better code that is extendable and easier to manage. The too long, didn't listen is to be like Pitbull, Mr. Worldwide. So that is all I have for you today. Again, if you want a copy of these slides, I will be posting them on my Twitter or my website. And I do have some resources for you if you want to look into more of the topics. I know I did jump around to a lot of things. I've got an i18n checker for you. And yeah, we've got some information. So thank you all for listening. And I hope you enjoy the rest of the conference. And I'm sorry my cats were running around like crazy during this talk. Bye.