Alright, so I'm going to get started. Thanks, Adam, again for the intro. I'm talking about the UX of language. There's a lot of stuff on the floor here. I'll briefly describe what I mean by the UX of language, because that may not be immediately clear.

So a preface: there's a lot of pressure on someone like me, who submits a talk called the UX of language, to speak well, because my user experience should be good. But I'll warn you that that's not me. There's also a misconception that, since I'm talking about translations and internationalization a bit, I speak a different language. Someone told me that that was the way to say that. I don't know. I do, however, do a convincing Jersey accent, so we could maybe put that in. So we'll call me an enabler of people who can speak well. That's my preface.

The other thing is that this is less about internationalization and more about user experience. It really comes down to making sure that the language in your application fits your users best, whether they speak a different language or not. The first step to internationalization, if you speak English, is getting English right. So, step one.

The way this relates to you and your applications is that any time you output a string, and right now you probably just use the plus operator to concatenate some data into it, you want to consider the problems that I'm going to address here. That may look to you like "There are five results," or "You have three new messages," "Welcome Adam to come see X data," or "Your score is 25. You need 25 more points to continue," if you're developing games.

About me a little bit. That's poor translation, pluralization rather. I like to build JavaScript programs. I work at a place called Bazaarvoice in Austin. We do ratings and reviews and stuff on lots of retailers that you've probably heard of. The idea is that we can put our ratings and reviews in any language on anyone's site, anywhere.
So I get to do this for a living. It's fun. I really enjoy it. My mom was an English teacher. It was for little kids, though, so I don't know if it counts. I have an eight-inch vertical jump. I bench 115 on a good day, and I can score 250, probably. I never tried.

So, getting into it. I wrote a blog post, maybe like two of you have read it, and this is the premise that I worked off from the blog post. So when I went to write the talk, naturally I just took a screenshot of it. Easier that way. So Greg Brockman, I have no idea who he is, but he wrote this funny thing, and then a thousand people retweeted it, and I guess one of my friends did. And I thought that was funny, more or less.

So a brief history of string concatenation is where we start. We'll go back to 1986. You have text-based browsers, and the web was for documents, right? Static files. You could get to TXT files. There were no formats. There was no HTML. There was nothing, right? Like ARPANET, or whatever Al Gore was working on.

So some time passes. I wish I could have put in, what is it, a montage, like Rocky. And people desire to customize content based on people, so different people on the same web page would experience different things. We're talking about mixing data and documents. This was a fundamental shift in the web, and that's why we have web applications and games and things like that now. I think Tim Berners-Lee has a quote somewhere, that I totally bothered to look up but it just got lost in my slides, that he never intended for the web to be dynamic. He didn't even intend for it to have images. He's like, why would you need images? And then later, when they tried to add authentication, he's like, no, the web is open. You don't need authentication anywhere. It's crazy. So we branched a little bit from that, but I still like the guy.

So then we enter the era of literally just outputting concatenated strings. You might remember the cgi-bin folder.
And this is just some headers, some HTML, some strings. This is my first web application. 1993. I was nine.

And then after that, more time passed. I figured the montage was too long to do twice. But then we get to crappy templating languages. There was a language called Personal Home Page that later got renamed into something else that I don't know what it is. I think it's one of the recursive ones. PHP, something, something. I don't know. It's weird. But you get into these situations where your template is also the application. So you just have an if block, and you're like, if they're logged in, show this. If they're not logged in, show this. And then go here. You could have one file that was literally the whole application, including the markup. And that was like ten years of string concatenation dark ages. That was the non-innovation period in the web. And it was where like half of us started. And I'm glad I don't work that job anymore.

So from around 2004 to the present, you know, Django and Rails and all these, even PHP came around, especially with PHP 5. There were a lot better mechanisms. PHP really was a template language first that just got out of hand. But now there are real, first-class language things in PHP. And so you can do things like MVC. MVC is not the only thing, but you have separation of concerns. You can actually use good software development practices. It's kind of like there were the software engineers and there were the people building web pages, and they never talked. And so finally you can do things like logic-less templates. I know not everyone loves those, but I think everyone can agree that you don't want too much of your application logic inside your templates.

That's not really what I'm talking about. Really what I want to say is that we're pretty good at rendering HTML right now. We can separate data.
We can separate views and controllers and have templating languages that we like. And it all feels pretty good to us. So we can finally render HTML. We're out of that type of string concatenation. Modern iron age.

So what about rendering a sentence? That's the basis of my talk: how do we render sentences? Do the techniques that we use in templates work for sentences? So let's try it. Here's a templated sentence. Handlebars. Oh, you want to, I guess that would work. That? All right. Maybe I can go bigger, we'll see. So this is a Handlebars template, because Yehuda's here and I didn't want him to beat me up. Also, I love them. So, "Welcome to the site, username." Seems good. So you put it in. "Welcome to the site, Adam." It's nice.

So let's try another sentence. "Hi, Adam. You have blank new messages." Seems good. We'll give it a shot. It's not good. "Alex, you have one new messages." Not that bad, but it's not exactly my favorite thing. If an application told me this, I would probably disrespect it. It's gross. Hence the UX of language, right? That is poor user experience.

So our traditional way of concatenating strings together is not sufficient when those strings form a sentence. The way we create HTML can't be the way that we create sentences, unless those sentences are completely static, in which case that's fine. Do whatever you want. This is true if you support multiple languages, but it's still true if you only support English. And trust me, if you only support English, in like four months you'll have to support other languages, and you'll be really mad that you thought you never had to support other languages. Because that's a terrible time, re-internationalizing something.

So this is a rhetorical question. It's supposed to be poignant. How many of you optimize the markup on your page to be, like, prettily indented?
So you make sure the template prints out so that you have the correct indentation in the source of your page. Probably like half of you, at least. Or maybe I'm just weird. And then we do things like have sentences that suck. So it's weird that we're so worried about the source, but we can't actually get right the content, the thing that someone is actually there to see. And so that's the poignancy. That's not a word.

So we can do better. Consider the previous example, "You have two new messages." That one works fine. What if we just did this? You'll see this a lot. So, "You have X new message(s)," where the S in messages is in parentheses. And "Messages you have: two," which is like the Yoda version, right? In reality, people would just put "Messages: 2," but it wasn't as funny. I couldn't make the Yoda joke.

So the first one is just ugly. And it doesn't work for a whole slew of things that I've yet to bring up. And the second one works okay, but not if you care about user experience. It works well if you have tabular data, something like that. But it's really impersonal. Consider Facebook. "Two people liked this post," right? Can you imagine if at the bottom of every Facebook post it was just "Liked: 2"? Or even if they wanted to put "People who have liked this post: Alex Sexton, Adam Sontag, two more." It's just weird. I don't know. It doesn't work. The second one's called prefix or postfix. A lot of times this works well for internationalizing things, because the way sentence structure works, you can almost do that in most languages. But it still kind of sucks.

I just realized there's like a sea of Apples, but it looks like stars. So it's like Apple stars. Someone should take a picture of that later.

So should you do this in your application? Absolutely not. So we'll get to plural branching.
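The three approaches discussed so far (plain concatenation, the parenthesized "s", and the label-style prefix) can be sketched as follows. This is a minimal illustration, not code from the talk's slides; the function names are made up for the example.

```javascript
// Plain concatenation: what the "+" operator or a simple template gives you.
function naive(count) {
  return "You have " + count + " new messages.";
}

// The "(s)" workaround: never exactly wrong, never quite right either.
function parenthesized(count) {
  return "You have " + count + " new message(s).";
}

// The prefix/label workaround: fine for tabular data, impersonal as a sentence.
function labelStyle(count) {
  return "Messages: " + count;
}

console.log(naive(2));         // You have 2 new messages.
console.log(naive(1));         // You have 1 new messages.
console.log(parenthesized(1)); // You have 1 new message(s).
console.log(labelStyle(2));    // Messages: 2
```

Only the count of exactly one exposes the problem in English, which is why this kind of bug tends to survive casual testing.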
This is like, once you realize how poorly prefix and postfix pluralization works, you can try this, right? So you're like, hey, Alex, what if I just said, if it's one, do this. If it's this, do this. I hope someone just hears the audio and that sentence works well. So this is cool.

So let's add Spanish. You might be saying, I don't need Spanish. It doesn't apply to me. Just pay attention for a little while and I'll get back to you. So here's that same thing with Spanish. Hopefully those translations are good. And you can tell it explodes. It doubles immediately. It more than doubles, because you need if statements for this kind of stuff. And that's really gross. The other thing is that it's coupled directly into your application code. Where do you put that? You put it in your template? If you're using a logic-less template, you can't really do that. Where does that go? There's no good place where that fits. Plus, consider these commonly supported languages. Can you imagine the if block for each of these, in the case where you had to support, you know, half of these or five of these, every time you needed a sentence? That just gets unwieldy. So I guess I was feeling kind of sorry.

So I know you're thinking, or at least some of you are, that Richard Stallman came in and he's like, GNU gettext solved this 30 years ago. Everyone uses gettext. This is a solved problem. That's what I use. And that's cool. This is the reference. There's Richard Stallman right there. He's fighting ninjas. It's like an XKCD or whatever. So I thought so too. We'll get back to that. I just thought I'd interject that, because everyone likes a good comic.

So, navigating the message landscape. I want to go a little deeper and talk about properties files. This is the next thing people tell me: oh, well, I don't do the if statement stuff or whatever. We have internationalization. We use properties files.
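To make the "it explodes" point about plural branching concrete, here is a minimal sketch of the inline if-statement approach with English and Spanish. The function name and the Spanish strings are illustrative assumptions, not the talk's slide content.

```javascript
// Hypothetical helper: plural branching written inline, one branch per locale.
function newMessages(locale, count) {
  if (locale === "es") {
    if (count === 1) {
      return "Tienes 1 mensaje nuevo.";
    }
    return "Tienes " + count + " mensajes nuevos.";
  }
  // Every other locale falls back to English.
  if (count === 1) {
    return "You have 1 new message.";
  }
  return "You have " + count + " new messages.";
}

console.log(newMessages("en", 1)); // You have 1 new message.
console.log(newMessages("es", 3)); // Tienes 3 mensajes nuevos.
```

Each additional locale adds another outer branch, and each additional sentence repeats the whole structure, all of it living in application or template code.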
And a lot of times we'll do it for internationalization, because the way it works is you can switch out your messages nice and easy. A lot of times I'll ask people, do you have internationalization support? And they'll say, yeah, we have properties files. Which is the equivalent of me asking, is this TV high definition, and someone saying, yes, it has pixels. It's not a no, but it's not a yes either. So it's not necessarily what you want.

But here's how it works. You have a properties file for each language. In this case, it's one file and they're just objects, but whatever. And I was like, that joke wasn't funny. So this is the way they work. And then in your templates, you pull them in as the properties data, or whatever you could call it, i18n or whatever. And then you have button text and you have invalid message, and they go in. And that kind of stinks, because not only do the translators not have any reference for where it goes, but when you write your templates, you're like, what is that? When you have strings in there, you know that this is the submit button and what the validation message is. And so this makes it hard on you.

It's pretty common in a lot of places. By default, a lot of languages come with the ability to do this. They're not bad inherently, but if you want true internationalization support, you need more. You can use properties files to do more than what people generally do with them. But it doesn't really address any of the hard problems except for rendering different locales. So now you can switch out the strings, so one's Spanish and one's English. And that's a totally different thing, but now what about pluralization and stuff like that?

So sometimes people add sprintf. This is like an old C function that we keep around. It's not in JavaScript by default, but it's in a lot of languages. It's implemented kind of okay in, like, underscore.string. And this is what that looks like.
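A minimal sketch of properties files combined with sprintf-style placeholders might look like this. The `format` helper is a tiny stand-in written for this example (it only handles `%d` and `%s`), not the underscore.string implementation, and the keys and strings are assumptions.

```javascript
// One "properties file" per locale; here they are plain objects keyed by locale.
const properties = {
  en: { buttonText: "Submit", newMessages: "You have %d new messages." },
  es: { buttonText: "Enviar", newMessages: "Tienes %d mensajes nuevos." }
};

// Minimal stand-in for sprintf: fills each %d or %s with the next argument in order.
function format(template, ...args) {
  let i = 0;
  return template.replace(/%[ds]/g, () => String(args[i++]));
}

// Look up a key for a locale, then interpolate the arguments.
function message(locale, key, ...args) {
  return format(properties[locale][key], ...args);
}

console.log(message("es", "buttonText"));     // Enviar
console.log(message("en", "newMessages", 3)); // You have 3 new messages.
console.log(message("en", "newMessages", 1)); // You have 1 new messages.
```

Switching locales and interpolating values both work, but the count of one is still wrong, now in every language at once.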
So the message inside the properties file has the sprintf parts in it. And you have X messages. And we're back to the initial problem: what if there's one? "You have one messages." And so it didn't really solve much in that case, except now we can poorly translate and pluralize every string instead of just English. And our templates suck more. So we can switch out messages. We can interpolate, is how you say that. But they're incorrectly pluralized. They don't have gender. They don't have context. And they don't have a bunch of other stuff that you'll probably end up wanting. Not to mention, authoring templates sucks.

So we're back to Stallman. gettext. gettext is pretty cool. It's the way they internationalize a lot of the Linux stuff. Stallman didn't write it, I don't think, at all. It's just that the comic works better if Stallman wrote it.

So, plural forms and PO files. I thought that was a good album title. Like if I ever wrote an album: gettext, Plural Forms and PO Files. So plural forms are pretty cool. It decouples the data from the message, but it also decouples the locale specifics. And this is what can really make a difference in starting to internationalize your app well. Or not just internationalize. Just have good messages.

So this is kind of a crappy version. Well, no, it's a way better version of a PO file. It's crappy in the sense that it wouldn't work. But it's very simplified. There's this old PO file spec, and it's all just raw text, and it's like half arrays, and there are equals signs and just random characters everywhere. So this is, if you were a sane person, how you might do a PO file these days. Essentially, there are the messages, and then there's a plural forms section, and then you tag it with the locale. The plural form gives you the index of the message that you would use whenever you pass the number in. So if you pass in one, it'll return zero.
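The simplified PO-style structure described here (messages, a plural form section, a locale tag) could be sketched as below. This is an assumption about the general shape, not the real PO syntax and not the talk's slide; `translatePlural` is a hypothetical helper.

```javascript
// A simplified, JSON-like stand-in for a PO file (not the real PO syntax).
const catalog = {
  locale: "en",
  // Returns the index of the message variant to use for a given number.
  pluralForm: (n) => (n === 1 ? 0 : 1),
  messages: {
    // The key is the singular source string itself.
    "There is one message.": ["There is one message.", "There are %d messages."]
  }
};

// Pick the variant by plural index, then interpolate the number.
function translatePlural(catalog, key, n) {
  const variants = catalog.messages[key];
  const template = variants[catalog.pluralForm(n)];
  // If the chosen variant has no %d, nothing is replaced.
  return template.replace("%d", String(n));
}

console.log(translatePlural(catalog, "There is one message.", 1)); // There is one message.
console.log(translatePlural(catalog, "There is one message.", 3)); // There are 3 messages.
```

The locale-specific knowledge lives entirely in `pluralForm`, so the calling code only ever passes a key and a number.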
And so in that array of messages, the first one will be picked. It would pick "There's one message." And for the other one, it would return the other index. And so you could sprintf it, and it does it nicely. So in this case, that's me doing that. You just pass the message count in to the result of that. And if you pass something in to the first one, nothing gets replaced. It's just "There's one message."

The other thing to notice here is that the key in this file is actually the string that is the singular form. Is that me? No, it's someone. And this is nice, because whenever you add these keys into your templates, you end up with the real strings in your file again. And so that's pretty cool. We used to look at that.

So this is the English plural form. One is the only thing that changes the way we pluralize a sentence. Spanish is similar. Exactly the same, actually. French is a little different. Zero is the same as one. So zero and one work one way, and then everything after that works with the plural form. Slovenian's crazy. It's like, mod 100, if the end is one, then do something, and then two, and three or four, and otherwise do the normal form. Which is pretty nifty. If you think about it, the thing it seems the most like to me is first, second, third, and then everything else is "th." That makes sense, right? But I don't know. I don't care. I don't translate anything to Slovenian.

So this is what it would look like inside your file. If you notice, this is Handlebars, and this is a regular helper. And the helper is underscore. And you might be like, underscore? That doesn't make sense. Underscore was gettext way before it was Underscore.js. But now if you search for underscore, I guess I won't search for underscore right now. Literally, Underscore the JavaScript library beats underscore the thing on Google. That's cool. The JavaScript.
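The per-locale plural forms described above can be written as small index functions. The sketch below follows the commonly published gettext `Plural-Forms` expressions for these languages as far as I know them; treat the exact rules as an assumption and check them against your own locale data.

```javascript
// Each function maps a count to the index of the message variant to use.
const pluralIndex = {
  // English and Spanish: only exactly 1 is singular.
  en: (n) => (n !== 1 ? 1 : 0),
  es: (n) => (n !== 1 ? 1 : 0),
  // French: 0 and 1 both take the singular form.
  fr: (n) => (n > 1 ? 1 : 0),
  // Slovenian: four forms, chosen by the count modulo 100.
  sl: (n) => {
    const r = n % 100;
    if (r === 1) return 0;
    if (r === 2) return 1;
    if (r === 3 || r === 4) return 2;
    return 3;
  }
};

console.log(pluralIndex.en(0));   // 1
console.log(pluralIndex.fr(0));   // 0
console.log(pluralIndex.sl(101)); // 0
console.log(pluralIndex.sl(104)); // 2
```

Because the function only returns an index, translators can supply as many variants per message as their language needs without any change to the application code.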
So this is cool, because you can see this and you can actually write this straight up without doing a PO file first. And if these keys don't exist in your PO file, then it'll just use the ones you put in there. So it kind of allows you to write everything you need in your template file. And later, if you need to switch them out, you can create the PO files and do that kind of stuff. gettext will also do fuzzy matching. So if you miss a comma or a period somewhere and you put it in, it won't just break, and you don't have to go switch out all the keys and things like that. However, that's really difficult to do in JavaScript, so most people don't.

Gender is a little different. This is kind of a hack that people do with gettext. What they'll do is say there's a context of the gender. And the way that context works is it just prefixes things with your context. That's not really the context delimiter. It's like some weird unseen Unicode character, so it really sucks to try to show you guys. So it's not an asterisk. And the way that works is you just send in "he liked it," and that's the key. But if the context is female, then it will come out as "she liked it." And that's fine. It means that you can see one example of it inside your template. And that's pretty good. Also, p for some reason means context. We'll go with it.

So a common mistake is that you'll take two translated things and you'll concatenate them together. And you can't ever do that. That's bad. So this is why. "I like my red shirt," and "me gusta mi camisa roja." So, awesome. That was really good. So these are switched around. This is a fairly contrived example. I don't know how many people are going to be outputting this sentence. But it actually gets way worse in some other languages that I assume all of you know.

So what do you have now? We can pluralize our message. We can interpolate our message. That's good. We can pluralize our message correctly for each locale separately.
We can have templates with messages kind of in them. They don't actually have to end up mapping, but at least they're readable. And we can use gender-specific words as the locale requires. You can actually also add a rule for gender nonspecificity, which is like whenever someone's too far away to determine their sex. So if someone rides a horse over the horizon, "they rode a horse over the horizon." English barely has that. "They" is ambiguous. But it works.

So gettext seems to do a good job of some of the things we've been talking about, which are interpolation, gender, pluralization, and happiness as far as the way it looks in your templates. And that's why I wrote Jed. Any of you know Jed Schmidt? He lives in Japan, and he's a translator. But he just writes JavaScript for fun on the side. And all of his code is better than mine. So I named the library after him. Because, hey, he has a three-character name, and that's short to type. So this is the way this works. I kind of have two APIs on top of gettext. The old one, the actual gettext API, is on the bottom. And the new one's kind of like a cool jQuery chainable thing. And that's fun, and it works. And it's more or less what we've been talking about.

So what's wrong with it? Because it's not the end of the talk yet, and I still have more to say. So back to just English. "There are three messages from two people." gettext can't handle multiple plurals. That function takes one number, and then it picks the plural form based on that one number. And what if I also have a real context? They didn't put context in there for gender. Context is for, like, I'm translating this for when I'm in the menu, or I'm translating this for when I'm in this part of the application. And they can have slightly different contexts, like "look at this," or "look at this," which is the same sentence, so that didn't work well. And I can only go one level deep too, which is kind of indicative of why you can only have one plural.
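The gender-as-context hack can be sketched as a plain key lookup. gettext catalogs conventionally join the context and the message key with the EOT character (`\u0004`), which is the "unseen" delimiter mentioned earlier; the `withContext` helper and the catalog below are hypothetical stand-ins for a `pgettext`-style call, not Jed's actual API.

```javascript
// gettext-style catalogs store a context-qualified key as context + "\u0004" + key.
const CONTEXT_DELIMITER = "\u0004";

const messages = {
  ["male" + CONTEXT_DELIMITER + "He liked it."]: "He liked it.",
  ["female" + CONTEXT_DELIMITER + "He liked it."]: "She liked it."
};

// Hypothetical stand-in for a pgettext-style lookup: one context, one key.
function withContext(context, key) {
  const found = messages[context + CONTEXT_DELIMITER + key];
  // Fall back to the key itself when the catalog has no entry.
  return found !== undefined ? found : key;
}

console.log(withContext("female", "He liked it.")); // She liked it.
console.log(withContext("menu", "He liked it."));   // He liked it.
```

Since the context is a single string prefix, spending it on gender leaves no room for a real context such as "menu", which is the one-level-deep limitation described above.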
So you can't have two contexts, you can't have two plurals, you can't have two of anything. It is a C API, which sucks. This is all the functions. There was no function overloading back in the day, so you have to have each one of these things, and you have these obscure characters. And you can override that, like I did with Jed, but it still kind of sucks. Plural forms have to be sent with each message. So when someone sends a PO file, they have to put the plural form in. And why? I don't know why. They shouldn't have to. If one word changes, then you have to retranslate the whole sentence, because we can't do fuzzy matching very well on the client side. It would take too much guessing and time and things like that. So on the client side, it's not even that good a solution. It's not very extensible either. There are some things that I'll show you later that can show you what I mean by extensible.

So I want to take this section to thank Norbert Lindenberg, who is not here and doesn't know I exist, I think. But he is a guy on the ECMAScript, TC39, globalization spec, something or other. And I put out Jed, and I was all excited about it. I did the cool face thing with Jed's face, and I put it on the site. And Brendan Eich posted the library up to ECMAScript, and he's like, hey, internationalization guys, how does this look? And they shit all over it. So I want to thank Norbert for being the kindest person who shit all over my library. Rightfully so, though.

So next up, he convinced me to switch to something called ICU MessageFormat. ICU, they do a lot of Java stuff, actually. There is a MessageFormat in Java, but it is not ICU MessageFormat by default. So if you have preconceived notions of MessageFormat, leave your baggage at the door. And really, it's not even a thing. It's just two things put together: PluralFormat and SelectFormat.
And the way that they work is that, well, the way that PluralFormat works is it has a mapping of keys. So the plural form functions, like for English, where we had zero and one, they'll output any one of these keys. But only the minimum set. So in English, the only thing that matters is actually one. Everything else is other. Even though we have a zero and we have a two as exact literals, they don't actually matter. They're not different. So in English, we have one and other. Some languages use all six. They don't always exactly match up. You know, "few" means different things in different languages. But you can pretty much tell the translators that this is what it maps to. You kind of have to give it some wiggle room. You're nice to it. These all come from CLDR, which is like a database that stores plural functions and things like that. So thanks to those guys. It's in XML, though, so just convert it.

So the basic syntax overview for how this works. It looks pretty complex, which is crappy for translators, but it works really well. You have a literal, and you can just type it. That's just any words that you want to put in the sentence. You can just type them. If you want to do something a little crazy, if you want to use a variable, you can just put curly brackets around it. It's like mustache, right? And then if you want to use a plural or select function, you use this comma section, and then you end up, let's see if I mouse. Yeah, I do. Bigger, it doesn't, it wraps. Thanks, Adam. And so you have option one, option two, option three. That would be like a select. And so if those are the value of this variable, then these will run. And then it recurses. So inside here, you start up here again. You can have a literal, you can have a variable, or you can have this whole thing again. And then go infinitely recursive, which I don't suggest. It would take a while. So let's try this sentence again.
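The key-selection rule described above (an exact literal wins, then the locale's plural category, then `other`) can be sketched by hand. The message string follows ICU MessageFormat syntax as I understand it and is shown only for reference; the lookup table, the single-locale `pluralCategory` map, and the helper names are assumptions made for this example, and nothing here parses the string.

```javascript
// The kind of message the syntax describes, shown here only as a string.
const icuSource =
  "You have {NUM, plural, =0 {no new messages} one {one new message} other {# new messages}}.";

// Plural category functions; in practice these would be derived from CLDR data.
const pluralCategory = {
  en: (n) => (n === 1 ? "one" : "other")
};

// The same message hand-translated into a table of cases.
const cases = {
  "=0": "no new messages",
  one: "one new message",
  other: "# new messages"
};

function selectPluralCase(locale, cases, n) {
  const exact = "=" + n;
  // Exact literals win over plural categories.
  if (Object.prototype.hasOwnProperty.call(cases, exact)) return cases[exact];
  const category = pluralCategory[locale](n);
  if (Object.prototype.hasOwnProperty.call(cases, category)) return cases[category];
  return cases.other;
}

function youHave(n) {
  // "#" stands for the number inside a plural case.
  return "You have " + selectPluralCase("en", cases, n).replace("#", String(n)) + ".";
}

console.log(youHave(0)); // You have no new messages.
console.log(youHave(1)); // You have one new message.
console.log(youHave(2)); // You have 2 new messages.
```

A translator for another locale would supply whichever of the six category keys that language uses, while the calling code stays the same.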
So we have this message. We say, how many messages are there? That's our variable, num messages. We're saying that this is a plural form, a plural select, PluralFormat rather. And then we're saying there's one and then there's other. And then this is what comes out of it: "You have one message." If we wanted to change this to two, we come here and recalculate this, and "You have two messages." That's pretty nice. That can be pretty good. You can also come up here. Oh, it's hard with it doing that on my monitor. Let's say, "You don't have two messages." Yeah. All right. So that would be like the worst message. You might have any other number, but it's definitely not two. We can guarantee that.

So we can add literals in there too. This is where things get cool. A lot of times you want to treat specific numbers differently. So, "You have one message." We do that again. But if we switch this to zero, rather than saying "You have zero messages," which is fine, we can make it a little warmer and say "You have no new messages," which is nice. And so we have a special case for zero, and it overrides. These run first, and then the other ones run after that. So we can come in here and add, I'm not going to do the formatting nicely, 42. And "the answer to life, blah," right? And then we can come in here. It won't do anything, right? And then we can change this to 42. Right. Messages. I forgot to add "messages," whatever. You get it.

So for those of you who have been saying, I don't need multiple languages, this is what applies to you. That's just English, right? Before, you couldn't do that in any language, let alone English. So that's "five results in two categories." Or even worse, "He received five results in two categories." And you say, well, that never happens. But that's like everything Facebook or Google+ writes on your screen. If you do this with the if branching, you have to branch on all three gender choices, as well as each pluralization of categories and results.
So it blows up to 12 different if statements just for English. And that's not maintainable. So if we want to do this in ours, we just use these same forms twice, right? So you can say, "There are five results in two categories." "There are five results in one category." Let's do a crappy one, right? "There are zero results in six categories." Also very unhelpful. So we'll go on.

SelectFormat works similarly. Rather than the pluralization function choosing the key, you just get to choose the key. So that's how we can do gender. We can have the same one as before, and we say the gender. This is a select format. And then we'll select based on these three things, right? So, "He found five results in two categories." And we can do "she," and we can do my good friend Noel: "They found five results in two categories." That's nifty. You have named arguments. This is just like a template language. I won't play around too much with that. And then plugins are nice. That's totally readable. Everyone's good with that. So we'll move on.

More or less, the key thing to focus on here, let's see if I can do this, is this guy right here, the offset. And that's actually pretty cool in the long run, because you can take the number that you're passing in and apply an offset. This is the kind of extensibility that you don't have in other translation and message formatting stuff. So, "He added himself," or I could come in here and say, "He added himself and one other person." "He added just himself." "He added no one to his group." He added, oh, this is the first one we did. And then we'll do it here. "Two other people." We can change it to female. Well, we'll change it to my wife, Ali. Yeah, that's cute. "Ali added herself and two other people." And then we can do it like if we didn't know again, and then you don't know which of us, right? So that's pretty cool, how on every single one you can do all the different things and it just comes together.
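The combination of a gender select with an offset plural can be sketched by hand as follows. It assumes ICU's rule that exact literals are compared against the original number while plural categories and the displayed number use the number minus the offset; the wording of the cases, the `pronouns` table, and the `addedToGroup` function are illustrative, not the talk's demo message.

```javascript
// Select on gender: an unknown key falls back to "other".
const pronouns = {
  male: { subject: "He", self: "himself" },
  female: { subject: "She", self: "herself" },
  other: { subject: "They", self: "themselves" }
};

// Plural with offset 1: the person adding is counted separately from "others".
function addedToGroup(gender, count) {
  const p = pronouns[gender] || pronouns.other;
  const offset = 1;
  const remaining = count - offset; // the number shown in the sentence
  let rest;
  if (count === 0) {
    rest = "no one"; // exact literal =0, checked against the original count
  } else if (count === 1) {
    rest = p.self; // exact literal =1
  } else if (remaining === 1) {
    rest = p.self + " and one other person"; // category "one" after the offset
  } else {
    rest = p.self + " and " + remaining + " other people"; // category "other"
  }
  return p.subject + " added " + rest + " to the group.";
}

console.log(addedToGroup("male", 3));    // He added himself and 2 other people to the group.
console.log(addedToGroup("female", 2));  // She added herself and one other person to the group.
console.log(addedToGroup("unknown", 0)); // They added no one to the group.
```

Written as branches this is exactly the code the message format is meant to replace; the point is that a translator can express the same cases in one message string per locale.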
It's a very technical way of saying that. So this is translator friendly, because all the variations are translated together. You put all those together and you make sure each one of them works. You can automate displaying each one of the data points just with some dummy data. It's also not very translator friendly, because the syntax is crazy and they already speak, like, Chinese. You can't expect them to speak SelectFormat as well. Luckily, a tool for this wouldn't be that hard to build. It doesn't exist yet, but it would be pretty easy to build.

So I wrote messageformat.js, and that's what I've been using this whole time. And so you can pull this in. It's compilable. That's pretty nifty. What that means is I used PEG.js in order to parse the grammar. This is a parsing expression grammar generator. You could use Jison or something like that. It just generates an AST of the different nodes, and then I can take those and do some cool stuff with it. So this is our argument one from before. If you look, when we do a precompilation rather than a compilation, it'll output JavaScript. And the cool thing about this JavaScript is that it doesn't actually require MessageFormat anywhere in it. So once you compile your messages, it's just the JavaScript that's left. And that's pretty cool. And so we can come in here and we can say this was a named argument, to us is my favorite word. And you'll notice that this JavaScript changes in here. And that's pretty nifty.

It works a lot like Handlebars precompilation. And to that point, you can actually put it directly in Handlebars. You can give it some random key, and then in here, you can just write it raw. And then I have a plugin that I haven't released yet, but probably will within the week, that will take this, precompile this, and precompile the template. And everything's precompiled, and you can just throw away the Handlebars parser and compiler, and you can throw away the MessageFormat parser and compiler. And that's nice.
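The precompilation idea (parse once, emit plain JavaScript that needs no library at runtime) can be illustrated with a toy compiler. This is a sketch under heavy assumptions: the AST shape is invented for the example, only an inlined English plural rule is supported, and it is not how messageformat.js is actually implemented.

```javascript
// Hypothetical pre-parsed message: strings are literals, objects are variables or plurals.
const ast = [
  "You have ",
  { plural: "NUM", cases: { one: ["one message"], other: [{ arg: "NUM" }, " messages"] } },
  "."
];

// Emits a JavaScript expression (as source text) for a list of nodes.
function emit(nodes) {
  if (nodes.length === 0) return '""';
  return nodes
    .map((node) => {
      if (typeof node === "string") return JSON.stringify(node);
      if (node.arg) return "String(d[" + JSON.stringify(node.arg) + "])";
      // English-only plural rule inlined; a real compiler would use locale data.
      return (
        "(d[" + JSON.stringify(node.plural) + "] === 1 ? " +
        emit(node.cases.one) + " : " + emit(node.cases.other) + ")"
      );
    })
    .join(" + ");
}

// Wraps the emitted expression in a standalone function definition.
function compileToSource(nodes) {
  return "function (d) { return " + emit(nodes) + "; }";
}

const compiledSource = compileToSource(ast);
// The generated source references no library; evaluating it yields a plain function.
const render = new Function("return " + compiledSource)();

console.log(render({ NUM: 1 })); // You have one message.
console.log(render({ NUM: 3 })); // You have 3 messages.
```

In a build step, `compiledSource` would be written to a file and shipped in place of the parser, which is the same trade Handlebars precompilation makes for templates.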
So what does it all mean? It's extremely fast to just wrap your messages with the message formatting wrapper. You don't actually have to use it at all. But you can do it, and you'll thank yourself whenever you have to go back and internationalize your app. You can run a tool and it'll just tell you, here are all your messages. Put them out in some file format and do it. It's way easier to do it now and just ignore it forever than it is to go back and figure out all of them later. So even if you don't have any intention of, I've messed up that slide.

So date and number formatting are important if you're internationalizing too. It's not really in scope here. So you want to look into NumberFormat and, like, Moment.js or something like that. Commas as periods are crazy. It's like dogs and cats living together.

So there's no excuse for sentences like this. Premise of the talk: the language on your site is more important than the markup in your site. So pay just as much attention to it.

Are there any questions? Any one's, persons, people's, persons. I get it. All right. So I have the require-handlebars-plugin.