Good afternoon. I hope you are having a very good DrupalCon in Prague. This is my first DrupalCon talk, and I'm really honored to be here. I'm going to talk about how to localize the user experience, especially in the CJK (Chinese, Japanese, and Korean) languages. We have many tools in Drupal to help with translation, but understanding how to localize the user experience can make a website even better. Later in the presentation I will say more about CJK. I hope it won't blow your mind, but if you don't know CJK yet, let's embrace some culture shock. If you are working on a website with CJK content, I hope this will also be helpful.

I'm a Taiwanese Drupal developer, and over the last years of my Drupal adventure I have built websites in Traditional Chinese, Japanese, and English. I work at a Japanese research institute, OIST, which uses English and Japanese as equally important first languages, so we always make sure to bring the same experience to our audience in both English and Japanese. OIST is publicly funded by the Japanese government, directly through the Cabinet Office. It is an interdisciplinary research institute. If you are interested in the institute and want to learn more about our new website project, you can watch the recording of the talk Michael Cooper gave on Tuesday.

So let's talk about translation first. We all know Drupal has a well-developed translation system that lets us build well-translated websites. We can translate nodes with field translation. We can translate many of the strings that appear in the UI. Most of the time we just mark the string in Twig or PHP with the t() function or the t filter. Sometimes we also pass dynamic variables into the t() function. And we need context to be able to translate things properly into different locales. For example, in Japanese we translate "contact" into two different words, depending on whether it is an action or a section label.
So the meaning is different, and we need the context, which Drupal handles really well. We already know how to build a multilingual website on Drupal. Is that enough? Of course not, and that is why I chose this topic.

There was a Chinese newspaper editor, translator, and writer in the Qing dynasty. He is most famous for introducing Western ideas, including Darwin's natural selection, to China in the late 19th century. He described the three difficulties in translation with three characters: xin, da, ya (faithfulness, expressiveness, and elegance). That became the ideal model for translation, and it has influenced translation in publishing a lot. An ideal translation into Chinese or Japanese that follows these three principles can make the length of the text really different from the English, sometimes shorter and sometimes much longer.

Take just one word as an example: "translate". The lengths can differ, and although the first character is the same in Chinese and Japanese, the two are still a little different. Let's expand a little. This is a Drupal UI sentence we often see in the login UI. How much do the nuance and the length differ now? That feel is not in the original English text, but we can see it in the German, and at least in the Chinese and the Japanese as well. So different languages need different ways to express an idea.

Some studies give rough estimates of how much text expands or contracts after translation. The best way to accommodate this is usually to translate the page while you are doing the design. In our new website design, we selected a couple of crucial pages to translate at the very beginning, and we put that text into the design so the designer could see where lines should break and how much text we could fit on the page.
Sometimes the translators also help us reduce the amount of text so that it fits the design. The longer the original text is, the more care it needs. For example, here there are some high-density sections. In Japanese we want to keep the same rhythm and the same density, but the Japanese text actually has fewer characters than the English text.

In my experience building Japanese websites, the first challenge is the date. When we talk about dates, we know there are translations for weekdays and for months, but those are just translations. What else affects localization? Most of us know the Gregorian calendar. It has been the most widely used calendar since it was announced by Pope Gregory XIII in 1582. But do we all interpret a date in the Gregorian calendar in the same way? If you see December here, can you raise your hand? Okay, so I'm sure I'm in Europe, because Europe usually writes the day, then the month, then the year, which is rare in the United States. In our institute we have members from all over the world, and we have to make sure we deliver the same message and keep all users on the same page. So in the end we chose the ISO format, which coincidentally uses the same order that we need in Japan, and in Taiwan we use the same order too.

But is that all for dates? Not yet. I don't know what you can see in this calendar. We can see the year, we can see the weekdays, and we can see March at the top left. This is a Thai calendar, but why is the year more than 500 years ahead? And this is just one page of a sample calendar from Taiwan, which carries much more information for one day. We see the day in English, and we see the translation at the bottom right. We see 2023 and June at the top right. There is also the lunar calendar. And I heard some people say they see this 112. This is the year we use in Taiwan. What is that?
So there are actually different calendar years in the world that are still official. The lunar calendar is not official, but we use it for many holidays. The official calendar year, though, is used for government documents and applications, so sometimes we need to display this kind of date on a website. Most of them we can just calculate. Thailand uses the Buddhist Era, which is counted from the Buddha's passing. In Taiwan the count starts from the founding of the Republic of China, so the year is the Gregorian year minus 1911. And in Japan the current era began only about four years ago; it is now Reiwa, the fourth year.

The Japanese year can be more complicated. Last week I visited the Spanish Synagogue in Prague's Old Town, and I found this very sad and complicated information on this very small permit. It is actually a permit issued by the Japanese government during the war, in the Chinese city of Shanghai, and the validity date on the document is given as Showa 18. Showa is still very common. It is the era before the previous one, and it still appears on official documents today, for example on my residence card in Japan. Many handbooks in Japan also have a table where you can look up which AD year equals which imperial year. So these three eras are still in very common use in Japan.

How do we calculate this? How do we display them? It makes the calculation more conditional, right? Do we write a switch-case in the code? There is actually a smarter way: a Unicode project called CLDR. You just put the date in, and CLDR, the Common Locale Data Repository, has a lot of information about calendars and units in different languages. So it helps us handle, for example, this date issue. I think that's very smart, and I think that's fantastic.

All right, next: CJK. There are some issues that are exclusive to CJK.
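Before moving on to CJK input, here is a minimal sketch of the calendar-year conversion described above. It assumes a JavaScript runtime whose Intl API ships with CLDR-derived data, such as a recent Node.js or browser. The helper name yearInCalendar and the sample date are made up for illustration, and this is not necessarily how the OIST website does it.

```javascript
// Sketch: read the era and year of a Gregorian date in another calendar.
// Intl.DateTimeFormat draws on CLDR-derived data in engines built on ICU.
// "yearInCalendar" is a hypothetical helper name, not part of any library.
function yearInCalendar(date, locale) {
  const formatter = new Intl.DateTimeFormat(locale, {
    era: 'short',
    year: 'numeric',
    timeZone: 'UTC', // keep the result independent of the machine's time zone
  });
  const parts = formatter.formatToParts(date);
  const era = parts.find((part) => part.type === 'era');
  const year = parts.find((part) => part.type === 'year');
  return { era: era ? era.value : '', year: Number(year.value) };
}

const june2023 = new Date(Date.UTC(2023, 5, 15)); // 15 June 2023

// The "-u-ca-" extension selects the calendar; "-nu-latn" forces 0-9 digits.
const thai = yearInCalendar(june2023, 'th-TH-u-ca-buddhist-nu-latn');
const taiwan = yearInCalendar(june2023, 'zh-TW-u-ca-roc-nu-latn');
const japan = yearInCalendar(june2023, 'ja-JP-u-ca-japanese-nu-latn');

console.log(thai.year);   // 2566 (Buddhist Era)
console.log(taiwan.year); // 112 (Republic of China year)
console.log(japan.year);  // 5 (Reiwa)
```

The point of the design is that no era table or switch-case lives in the application code. The era boundaries, such as the change from Showa to Heisei to Reiwa, come from the locale data.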
It's very easy to type a lot of alphabetic text on a computer with just a keyboard. But how does it work when you type in CJK? This is the autocomplete search on our new website, and this is kind of a bad example. We are typing "corona", which is the word used for COVID-19, and every time we type, a different autocomplete search pops up, which is not necessary. At the end you can also see that we choose a different set of words when we do the search. Input in Chinese or Japanese consists of one to four different phonetic elements, and the same pronunciation can map to different combinations of words. So the main issue here is that the autocomplete triggers the search too early, before the word has been chosen or the typing is finished.

The keystrokes needed to type a word are also very different between languages. When typing CJK text, we would like to wait until the word is finally entered, and this can be done with special JavaScript events. If we implement it, we see this: when we type in English, it still works the same, and every keystroke triggers the search. But when we type Japanese, the search is triggered only once, after the underline disappears, which means the typing has ended. The magic is the composition events. We can treat compositionstart and compositionupdate as meaning that the typing is still in progress. Then, when it ends, we receive compositionend, and we can start the AJAX call or whatever else we want to do.

Okay. The last part is a little complicated, but I promise this is not a language class. In English we sometimes sort alphabetically, usually symbols, then numbers, then letters. How do CJK languages do something similar? Chinese, Japanese, and Korean all use different ways. Let me start with Japanese. In Japanese, the order of the fifty sounds, the gojūon, is the main ordering system we use.
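Returning briefly to the autocomplete fix: the composition-event approach described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the OIST website. The helper name attachImeAwareSearch and the runSearch callback are hypothetical, and the duplicate check is there because browsers differ on whether an input event arrives before or after compositionend.

```javascript
// Sketch: run a search on every keystroke for direct input, but only once
// per finished composition when an input method (IME) is in use.
// "attachImeAwareSearch" and "runSearch" are hypothetical names.
function attachImeAwareSearch(input, runSearch) {
  let composing = false;
  let lastQuery = null;

  const trigger = () => {
    const query = input.value;
    if (query === lastQuery) return; // skip duplicate requests for the same text
    lastQuery = query;
    runSearch(query);
  };

  // compositionstart: the user has begun choosing a word, so hold back.
  input.addEventListener('compositionstart', () => {
    composing = true;
  });

  // compositionupdate fires while candidates change; nothing to do, keep waiting.

  // compositionend: the word has been committed, so search once.
  input.addEventListener('compositionend', () => {
    composing = false;
    trigger();
  });

  // Plain typing (no IME) still searches on every change.
  input.addEventListener('input', () => {
    if (!composing) trigger();
  });
}
```

In a page this would be attached to the search field, for example attachImeAwareSearch(document.querySelector('#search'), query => { /* start the AJAX call */ }), where the selector is only a placeholder.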
They represent all the phonetic elements, 52 or 53 of them. And there are two sets. The left one is hiragana, the right one is katakana. Katakana is usually used for loanwords and sometimes for newly invented terms. There are also Chinese characters in Japanese, where they are called kanji. In Korean they are called hanja. And in Mandarin we just call them... yeah, I somehow forgot what my own language is.

If we have a group of Japanese words, as on the left side, what does the ideal order look like? On the right is the same sample in that order. It usually starts with symbols, numbers, and the English alphabet, then hiragana, then katakana, and the final part is the kanji. For hiragana and katakana we have already seen the table, so there is an order by row and then by column. But what do we do with the kanji? In Japanese, every single kanji can also be spelled out in katakana or hiragana. If we use sort directly, for example in JavaScript, we only get the order of the Unicode code points, which does not mean anything to Japanese people, because they cannot tell what order it is. Instead, we take the first character and order by its pronunciation. If the first characters are the same, the second characters are compared. This is the order we get by using localeCompare in JavaScript. But look right in the center, at one, half, and two. The others are in incremental order, but this one is not, because a Japanese character can have more than one pronunciation, and it was sorted by its other pronunciation.

All right, I will hold the different pronunciations for a moment and jump to Chinese. This is the character we use for Jie Ke, the Chinese name of the Czech Republic. It is the first character.
There are many different attributes we can use to order a character. There is the radical and its strokes, and there is the total stroke count. We can also spell the word with the phonetic characters we use in Taiwan, or with pinyin, which is used in China. Here there are three different pronunciations for one character. It can represent music, happy or happiness, or, as a verb, to appreciate.

In Japanese there are more, because Japanese also inherited a lot of different pronunciations from ancient Chinese. Out of 5,000 commonly used Chinese characters, 580 have more than one pronunciation. Out of 2,000 commonly used Japanese kanji, 600 have more than one pronunciation.

So how do we sort them? We have already seen that we can use localeCompare to make the order clearer. It also applies to other languages, such as Ukrainian, Farsi, or Arabic. It works for Japanese too, not perfectly, but it works. In PHP there is a similar internationalization class, but I would recommend not doing this in PHP. You can use the database query instead, because the database handles it better than PHP. I also mentioned that there are other attributes we can sort Chinese characters by. One is strokes: we can specify the stroke ordering from the CLDR repository. And we can use the other phonetic attributes. They all give different orders.

So in the end, what do we do with characters that have many, many pronunciations? Actually, there is no way. No way. But this issue only affects the order of Japanese kanji, and the order when we sort by pinyin or by bopomofo, the phonetic system in Taiwan. Native speakers know the common variations in pronunciation, so they already process them in their minds when they see the characters. As long as the order follows a consistent logic, it won't be a big problem for them.
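The locale-aware sorting described above can be sketched as follows. This is a minimal illustration that assumes a JavaScript engine with full ICU/CLDR collation data, as in current Node.js releases and browsers. The helper name sortForLocale and the sample words are made up for the example.

```javascript
// Sketch: sort the same words under different collation rules.
// "sortForLocale" is a hypothetical helper; results depend on the ICU/CLDR
// data bundled with the JavaScript engine.
function sortForLocale(words, locale) {
  // Copy first so the caller's array is left untouched.
  return [...words].sort((a, b) => a.localeCompare(b, locale));
}

const kana = ['う', 'イ', 'あ'];
// The default sort compares UTF-16 code units: hiragana before katakana.
console.log([...kana].sort());          // あ, う, イ
// Japanese collation follows gojūon order across both kana sets.
console.log(sortForLocale(kana, 'ja')); // あ, イ, う

const numerals = ['三', '一', '二']; // three, one, two
// The "-u-co-" extension selects a CLDR collation type for the locale.
console.log(sortForLocale(numerals, 'zh-u-co-stroke')); // by stroke count: 一, 二, 三
console.log(sortForLocale(numerals, 'zh-u-co-pinyin')); // by pinyin (èr, sān, yī): 二, 三, 一
```

The same three characters come out in three different orders depending on whether code points, strokes, or pinyin decide. That is why the ordering attribute has to be chosen deliberately for each locale.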
But of course, if we do need a special order, we have to customize it with a special array or a map for the characters.

So I think I'm on time. Just to recap: we know what to consider beyond internationalization, that there are different calendar systems, what the issue is when we input CJK, and how sorting methods depend on the locale. JavaScript already gives us some very good help. If you are interested in the slides, they are already online. Thank you.

OK, so let's go to the questions. We already have a question online that was put in at the beginning, so let's start with that. Dennis Patzer asks: we have a problem in Drupal using the t() function. For example, t('name') can mean last name in the case of a person, or simply name in the case of a product. How can this problem be solved in Drupal?

I think it can be solved with context. If this is in PHP, you can pass a context in the options, and then you will be able to write two different translations in the PO file. There is an additional attribute there where you write the context for the translated text. In the UI translation you will also see a small help text saying what the context is.

OK, any more questions? We have very little time, but yes.

Thank you. On Tuesday you presented some problems with word wrapping. Is there anything in Drupal that helps break words correctly at the end of a line?

There's nothing in Drupal that I know of, because it's mainly a front-end matter. In Taiwan, we don't like a line to break so that a symbol such as a punctuation mark becomes the first character of the next line. There are some JavaScript tools to prevent that by pushing more words onto the next line. Maybe there is something similar for Japanese as well.

Thank you, everyone. If you have questions for Michael and Chris, just step outside the door. We have another session starting in two minutes. Thanks, Chris. Thank you.