Hi everyone. I'm Emily. I work for a Finnish consulting company, Vincent, and that gives me a little bit of a basis for why I'm a good guy to have talking about localization. Finland is a bilingual country, in case you didn't know, and this means that we maybe face localization issues a little bit quicker, whenever we're doing anything, than you might in most parts of the North American continent, except here in Québec. Fundamentally, what I end up spending a lot of my open source time on is trying to find ways to make sure that this sort of thing doesn't happen.
This is something that I managed to dig up when actually putting the slides together for this, and it is by far the best example I've been able to find. I have no idea what might happen when you press the button that's not the OK button. It's probably safe, but you know, just not sure. This is a bit of an extreme, and you don't really have to go beyond the scope of the English language in order to find the issues that make localization, and the tools in JavaScript for localization, important and effective and powerful.
So really, I'm going to be talking mostly about Mark's photos. Mark is a photographer taking photos around here. He might show up at some point, and he takes a lot of photos, I'm sure. Actually, that's not an exact count, but photographers at conferences take a lot of photos, and it was a nice phrase to come up with as an example of the kind of formatting that you might want to do with variables coming in; localization, in other words. The first thing you might notice is that when you say 1246, it's nicer to follow the convention, in English for example, of putting a comma there as a thousands separator. How do you do that?
And if you're formatting this as the output of some application or whatever you're working with, how do you then tell it that the date should be printed out as, say, 11 December 2019? Many of these parts, in fact, are possible already, and you should be using the power of Intl. This is a part of JavaScript. Most of JavaScript is, of course, defined by ECMA-262. ECMA-402 is the other major specification, with its own working group, and what they are fundamentally working on is the Intl object.
NumberFormat is one of the best... How many people here have used NumberFormat? Intl.NumberFormat. Cool. So yeah, you can create a formatter, and then you can tell it to format the number, and for English, for instance, it'll put a comma there. For other locales, it'll do something different for that number.
DateTimeFormat is another really powerful part of Intl: provided that you have a Date object, you'll be able to format it as a string. And this brings in one of the issues of English: there is no just one English. This is what happens when you format a date with a numeric day, a long-format month, and a numeric year in British English. If you go for the default US American option, the formatting is different. So it matters, even within the scope of English. Because of some of the rules of how the English language works, it's really good for these examples.
RelativeTimeFormat is another interesting, more recent... Sorry, going back. DateTimeFormat, who here has used that? About the same, maybe a bit more. How about RelativeTimeFormat? No. I mean, it hasn't been around for that long in actual implementations, but it gives you nice things when you're trying to provide a nice user experience and say, not this specific date, but that it was two days ago. In a somewhat similar vein, there's ListFormat.
You give it a list, an array of things, an array of strings, sorry. And what you get out of it is a locale-specific formatted way of expressing those as a string: motorcycle, comma, bus, Oxford comma, and car. This one, the Oxford comma, does not vary between British and American English, at least in this implementation.
But how about when you want to put the whole message together? Mark took 1,246 photos on 11 December 2019. This specific thing does not exist yet. I'm part of the ECMA-402 MessageFormat working group, which is just starting now. We are just now starting the work of defining exactly how this thing works and what else is really part of it, and we do not have most of the answers to any of those questions yet. But we're working on it. It'll be a thing in the future, but not yet.
What is there is Intl.PluralRules. Anyone here used this for anything? Yeah, not surprised. It gives you a little part of what MessageFormat does: you can give it a number and it'll tell you what plural category that falls into. English is kind of simple for normal plurals. You have either one photo or many photos. But for other languages, there are many complicated ways of doing this. And in fact, as I mentioned before, you don't even need to go beyond the scope of English itself to find some really interesting plural rules, because there's the other category of plurals. Besides the cardinal plurals, there are the ordinal plurals. Hi Mark. This slide doesn't have your name on it, but most of the others do. Yeah, ordinal plurals in English are complicated. First, second, third, 1246th. You know how it goes. So this too you can get out of Intl.PluralRules, but mostly this is useful for guys like me who are writing libraries that others then use. But yeah, it's there.
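The Intl formatters mentioned so far can be tried out in a few lines. This is only a minimal sketch: the sample values come from the talk's running example, while the variable names, the `toOrdinal` helper, the `en-US` and `en-GB` locale tags, and the fixed UTC time zone are assumptions added for illustration.

```javascript
// Sample data from the running example (the exact values are illustrative).
const photoCount = 1246;
const date = new Date(Date.UTC(2019, 11, 11)); // 11 December 2019

// Number with an English thousands separator.
const count = new Intl.NumberFormat('en-US').format(photoCount);

// The same date options give different results in British and US English.
// timeZone is pinned to UTC here only so the result does not depend on the machine.
const dateOptions = { day: 'numeric', month: 'long', year: 'numeric', timeZone: 'UTC' };
const british = new Intl.DateTimeFormat('en-GB', dateOptions).format(date);
const american = new Intl.DateTimeFormat('en-US', dateOptions).format(date);

// A relative time instead of a specific date.
const relative = new Intl.RelativeTimeFormat('en', { numeric: 'always' }).format(-2, 'day');

// A locale-aware list of strings.
const list = new Intl.ListFormat('en-US', { style: 'long', type: 'conjunction' })
  .format(['Motorcycle', 'Bus', 'Car']);

// Plural categories: cardinal ("1 photo" vs "2 photos") and ordinal ("1st", "2nd", ...).
const cardinal = new Intl.PluralRules('en-US');
const ordinal = new Intl.PluralRules('en-US', { type: 'ordinal' });

// Hypothetical helper: maps the English ordinal categories to suffixes.
const suffixes = { one: 'st', two: 'nd', few: 'rd', other: 'th' };
function toOrdinal(n) {
  return `${n}${suffixes[ordinal.select(n)]}`;
}

console.log(count);                  // "1,246"
console.log(british);                // "11 December 2019"
console.log(american);               // "December 11, 2019"
console.log(relative);               // "2 days ago"
console.log(list);                   // "Motorcycle, Bus, and Car"
console.log(cardinal.select(1));     // "one"
console.log(cardinal.select(1246));  // "other"
console.log(toOrdinal(3));           // "3rd"
console.log(toOrdinal(1246));        // "1246th"
```

Intl.PluralRules only returns a category name such as "one" or "few"; turning that into text, as the suffix table does here, is left to the caller or to a message formatting library.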
Fundamentally, there are a number of libraries that provide the services of message formatting in JavaScript, and most of these have been around for a while. Those are the years in which each of those projects started. And this is the list of things that I would currently recommend. It is quite likely that if you're doing localization, the tooling you're using is built on one of these libraries at some fundamental level. FormatJS and i18next are probably the biggest of these. Globalize and messageformat are both OpenJS Foundation projects. I maintain messageformat, so I'm kind of biased in that respect. Project Fluent is a really interesting project started at Mozilla a couple of years ago, and it's still very much in active development and active use. Mozilla is currently porting all of Firefox and everything else to use Fluent.
Then, when you are using these, you are most of the time not using any of these libraries directly. You are most often using something that depends on the environment that you're in. For React, Angular, Vue, Svelte, all of these have a number of toolkits that effectively use the libraries that I mentioned before, along with their ecosystems.
Then, when you go really, really deep down into where you get the strings for all of these systems, there are a number of translation services. With these, I don't have that much direct experience myself, because in the projects that I end up working with, a lot of it is actually done by the programmers who are writing the code, or by people near them. But localization is one of those parts of application development where it turns out that there is no one right tool for everyone.
It ends up varying and depending a lot on the exact specifics of what you need, how you're using it, what your own history is, and all of that. So I can't really go that deeply into all of these. But how do you then format these messages? In particular, when you have a string like "Mark took", or with a variable for the name, someone took some number of photos, and then you need to decide whether it's zero photos, no photos, one photo or many photos, and then on a specific date.
The format that I would by far recommend at the moment is ICU MessageFormat, and messageformat, the JavaScript library, is of course a JavaScript implementation of this. This is not the only possible message formatting language, but the other options are kind of challenging. They end up being overly complex, and when you start working with messages that are complicated, that have variables and plurals coming in, they feel like they've been patched in order to support all of the features, whereas MessageFormat is more a solution for the whole thing. i18next has its native format for this case, which looks kind of like this, and Polyglot.js is another format, used by a number of, well, not a number, but by Polyglot.js itself. It gets complicated, so as a fundamental language for message formatting, if your messages have any complexity, ICU MessageFormat is currently a really good choice.
Project Fluent is kind of an evolution of ICU MessageFormat, and the people behind it are very much active in the ECMA-402 work and are participating in figuring out what the Intl.MessageFormat format should look like. Should it be something more like what Project Fluent has done, or more like ICU MessageFormat? We're working on it.
One of the really interesting things that Project Fluent has done is that it's taken a lot of the issues that come up in the actual practice of using MessageFormat, where you have translators working with the same files as programmers, and simplified some of the specifics there. A really interesting part is that Project Fluent is actually a file format that they've specified, which allows you in Fluent to refer from one message to another message within the same file. This is not really possible with MessageFormat, because MessageFormat is formulated so that you're talking about the contents of a single message, whereas with Fluent you're talking about the contents of an entire file at a time, and you can build really interesting things. So definitely a thing to keep track of, but still a little bit in development.
So coming back to ICU MessageFormat, you have these messages and you need to keep them somewhere. How does this work? You could put them in an XLIFF file, which is XML, and this is the closest thing to a standard for localization stuff that we currently have. I'm not a fan of XML. I've never actually used an XLIFF file for anything. If you think this is a good choice for you, great. Some of the origins of MessageFormat itself are embedded in this sort of file format, where you can see that using curly braces for variables, and other aspects of the syntax, kind of fit in. They don't conflict with the XML syntax itself. So this is one option.
.properties files, common in Java, are another entirely valid option, and it kind of works. You can have multi-line strings as well. You just need to remember to escape all of the line breaks, and it's a little bit tricky in some respects. Or you could have your messages in JSON files. And this is one point where the fact that JSON is good for machines rather than for humans comes into play, because you can't have multi-line strings.
Even this example, shall we say, scrolls quite a bit in order to show the whole message. And this is in fact why I went and wrote my own YAML library. YAML is actually a good fit for this because it lets you escape the curly bits better than most other configuration file formats, but it's got some ugly bits to it. Hence why Project Fluent went with their own file format, because if you define it yourself, you can define it in a way that doesn't conflict with the object notation that you have in other languages.
But yeah, so you have these messages, and this is where I would actually disagree with what Miles was saying about transpilation in the keynote at the very start of the conference. When you have a message in MessageFormat and you fundamentally need to get a string out of it at the end, I believe that transpilation is really the way to make this work. And this is what I have done a lot with messageformat the library, which allows you to transpile the MessageFormat source into JavaScript functions that output strings, and to do all of that at build time, so that when you actually send the bytes and everything to the client in the browser, you don't need to do anything except run the function with the variables that you're passing in, and you get the string out of it. Which I think is kind of cool, and I don't think any other library except messageformat supports this.
So, an entirely possible thing to do. messageformat@next refers to the fact that the version 3 beta release is rolling. But the functionality that I'm demoing here works equally well with messageformat 2, which is what you get if you drop the @next from these. So that's a webpack loader definition for it, and that lets you import from a message.yaml file, for instance.
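To make the transpilation idea a bit more concrete, here is a hand-written sketch of the kind of function a build step could produce from an ICU MessageFormat message. It is not the actual output of the messageformat library: the message text, the function name `markPhotos`, its parameter names, and the fixed UTC time zone are all assumptions made up for this illustration.

```javascript
// Hypothetical ICU MessageFormat source, based on the running example:
//
//   {name} took {count, plural, =0 {no photos} one {# photo} other {# photos}}
//   on {date, date, long}.
//
// Below is a hand-written approximation of a compiled English version.
// It is NOT real compiler output; it only shows that the result can be a
// plain function that needs no message parser at runtime.

const pluralRules = new Intl.PluralRules('en-US');
const numberFormat = new Intl.NumberFormat('en-US');
const dateFormat = new Intl.DateTimeFormat('en-US', {
  day: 'numeric',
  month: 'long',
  year: 'numeric',
  timeZone: 'UTC', // assumption: pinned so the output is the same on every machine
});

function markPhotos({ name, count, date }) {
  let photos;
  if (count === 0) {
    // The "=0" case matches the exact value before any plural category.
    photos = 'no photos';
  } else if (pluralRules.select(count) === 'one') {
    photos = `${numberFormat.format(count)} photo`;
  } else {
    photos = `${numberFormat.format(count)} photos`;
  }
  return `${name} took ${photos} on ${dateFormat.format(date)}.`;
}

const when = new Date(Date.UTC(2019, 11, 11));
console.log(markPhotos({ name: 'Mark', count: 1246, date: when }));
// "Mark took 1,246 photos on December 11, 2019."
console.log(markPhotos({ name: 'Mark', count: 1, date: when }));
// "Mark took 1 photo on December 11, 2019."
```

The point of doing this at build time is that the browser only ever receives functions like `markPhotos`, never the message source or a parser for it.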
And this means that because you're doing it at build time, and because you're transpiling, you don't have to worry about the cost of compiling the MessageFormat source into a function at runtime and then actually getting variables into it and all of that. That's happening at build time rather than at runtime. And it just works. And there are also plugins for using it in React, for instance. useMessage there is getting you the message from a React context and working from there. And I will need to write a Node import loader in order to make all of this work in Node as well, because that's cool.
In the end, when you're starting to work on all of this, you really just have to realize early enough that localization matters: there might come a time later on when you do in fact need to localize your thing into a different language than the one you started out with, or just to be able to change the strings separately from your code base itself. And then the most important thing is to tag things early enough with something like a dumb function that doesn't do anything. This message function here just effectively returns whatever you pass in. Even doing just this will let you later on find all of the strings, hook into whatever the function is, and make it work. In the end it doesn't matter what format your messages are in, because you can always take them from one structured format into another one and use them from there.
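The "dumb function" approach can be sketched in a few lines. The names `msg` and `createMsg`, the lookup-table design, and the Finnish sample translation are assumptions for illustration only, not the API of any particular library.

```javascript
// Step 1: a placeholder that does no localization at all.
// It only marks user-visible strings so they can be found later.
function msg(text) {
  return text;
}

const title = msg('Photos');

// Step 2 (possibly much later): a drop-in replacement with the same call
// shape, backed by a lookup table keyed by the source phrase. Unknown
// phrases fall back to the source text, so untranslated strings still render.
function createMsg(translations) {
  const table = new Map(Object.entries(translations));
  return (text) => (table.has(text) ? table.get(text) : text);
}

const msgFi = createMsg({ Photos: 'Valokuvat' });

console.log(title);             // "Photos"
console.log(msgFi('Photos'));   // "Valokuvat"
console.log(msgFi('Settings')); // "Settings" (no translation yet, falls back)
```

Because every call site already goes through `msg`, swapping in a real implementation is a change in one module rather than a hunt for string literals across the code base.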
It doesn't really even matter what message formatting library you're using, because provided you've wrapped your things in some local function like this or something similar, you can switch from one system to another relatively easily. This means that you're going to escape the horrible feeling, at some much, much later time when you're working on your code, of noticing that you need to go back and fix all of the literal strings everywhere in your app. Because yes, I've done that, and I'm absolutely certain I'm not the only one here who has done something like this. Need to plug in. Yeah, just do something like this early on. And this is just a generic whatever function. I mean, in React you can do something equally dumb to start with. It doesn't have to be anything, as long as you have something that your code completion or whatever can grab on to, which you can then later on modify if you need to.
But yeah, this has been it. I remain Emily, and I care about localization. I think you should care too. It's not a horrible beast; make it your friend, and be nice, and start to think about things early, and don't make decisions that you don't have to make. And even if you've made the decisions, realize that if you have some structure in how you're doing localization, it will be relatively easy to switch to another solution from wherever you're starting. And if you're doing everything at build time, then you don't have to worry about the cost of parsing ugly file formats into the functions that you're using at runtime. Thank you.
I think we have like five minutes or so, so if anyone has questions, I'm happy to stay here and answer them.
ICU MessageFormat is fundamentally supported by ICU for Java and C++, and I am not certain whether there are other environments that are also supported, but that's kind of it.
Yeah, I am not willing to commit to any sort of a timeline, because honestly the
MessageFormat working group has met once, and the next meeting is on Monday. So it's very much getting started and figuring out where we are, and what the expression of it really is: is it really Intl.MessageFormat, or is it something else that is going to be the implementation of it, whether it includes the DOM, whether it's something entirely different. Don't know.
With this one, then, you're entering the question of what you really use as the key when you have a string and you're localizing it from one language to another. There are effectively two schools of thought on this: whether you use a full phrase, in English for instance (English makes sense because the code itself is effectively English), or use a key to look it up. I'm proposing here to use a full phrase like this, because that means you don't have to actually make a decision yet; you can figure out the answer later. If you use a key here, then you have to have somewhere where you're looking up the value of that key, and then work from there.
Yes, if that's how you decide to do it later. The key here is that you have the message call there, and then if you're refactoring your code later, you can find all of the places where you're calling it. Because really, you're going to import the message function from some other file, and then you can track all of those imports, track all of those uses, and refactor those, rather than having to chase down all of the bare strings that you have in your code. Thank you very much.