So now, today, we welcome you to a talk where we will learn how to break things beautifully and creatively with emoji, by this foreign unicorn. She's a web developer by profession, and someone who likes to break things apart just for the sheer pleasure of it in her free time. So without further ado, please.
Yes, thank you very much. All right, so yeah, I'm here to tell you a bit about emoji, and as you can see, Chromium already capitulated for me while showing the Fahrplan. It should have been this emoji, which is coincidentally, or not so coincidentally, the emoji that seems to break the most apps, because it's pretty new and it doesn't work the way that other flag emoji work.
So, something about me: 20-something, professional web developer, hobbyist security researcher, whatever, and I also do game development. And I don't actually like emoji, so there are like three emoji that I use. You might ask yourself, then why am I holding a talk about emoji? It was kind of a coincidence: I just wanted to try something out last year, it broke horribly, and I found that interesting. And also, I don't know if you remember this news story about eels that forgot what humans are during the pandemic. Yeah, that's basically me, so bear with me if I'm talking a bit too fast or something like that.
All right, so let's start with the thing that got me interested in emoji breaking things in general, which is emoji domains. A quick bit of history about emoji domains, or about domains in general: DNS was standardized in 1987 with a very limited character set. So just letters, numbers, and a handful of special characters, which is definitely not enough for many languages. This was actually before Unicode, which we mostly use today and which also includes all those emoji, came out.
In order to overcome this issue of many languages not working with this limited character set, internationalized domain names were proposed in 1996, introducing this so-called Punycode, which allows encoding characters into this small set. Then some people registered emoji domains; those don't seem to be Punycode. They probably broke something just to register those domains, but they seem to still exist, because Punycode was only really added in 2003. There's also normalization in addition to Punycode, which isn't that important for us, but it normalizes some special characters to two letters within the set. Then in 2008, we had IDNA 2008, which banned emoji in most major top-level domains. And in 2020, we got the transgender pride flag emoji, which, as I said, already breaks the most things.
So what's Punycode? It's this weird, non-human-readable representation of Unicode characters. Most browsers will translate these emoji into Punycode if you enter them in the URL bar, and you will no longer see the emoji, because otherwise it would be used for spam: I could buy apple.com with letter emoji and people would think that I own that domain. So yeah, it will be converted into this unreadable mess. It's not supported in most top-level domains, but you can basically set anything as the third-level domain, so you can have fun with that.
So, what I was planning to do. I was trying to find a new job in early 2021, and I needed something to show as some kind of portfolio, and I knew that emoji domains were a thing that you could use. And I also thought, hey, transgender pride flag emoji, let's use that, so I won't get mixed up with potential employers who would have a problem with that. So I searched for domains, and at this point I knew nothing about Punycode. I wanted a .dev TLD, which is also the TLD of my blog, but they don't support it. I later found out my registrar doesn't support emoji domains at all.
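The Punycode conversion described above can be tried out with Python's built-in `punycode` codec. This is only a minimal sketch: the helper names `to_ace` and `from_ace` are made up for this illustration, and the code deliberately skips the normalization and validity checks that real IDNA processing (and a browser) would apply, so its output for an emoji label may differ from what a browser or registrar produces.

```python
# Minimal sketch of the Punycode step only; real IDNA handling also
# normalizes and validates labels, which is skipped here on purpose.
ACE_PREFIX = "xn--"  # prefix that marks a Punycode-encoded DNS label


def to_ace(label: str) -> str:
    """Encode one DNS label into its ASCII form (hypothetical helper)."""
    if label.isascii():
        return label  # plain ASCII labels are left untouched
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def from_ace(label: str) -> str:
    """Decode one ASCII label back to Unicode (hypothetical helper)."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


# White flag, variation selector, zero-width joiner, transgender symbol,
# variation selector: the transgender pride flag as a code point sequence.
FLAG = "\U0001F3F3\uFE0F\u200D\u26A7\uFE0F"

print(to_ace("bücher"))          # xn--bcher-kva
print(to_ace(FLAG))              # some unreadable xn-- string
print(from_ace(to_ace(FLAG)) == FLAG)
```

Because the codec encodes whatever code points it is given, it will happily encode the zero-width joiner as well, even though that character is not something a registry has to accept in a domain name.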
So I went to another registrar and bought a domain, and yeah, just so you don't try this: it has expired, because it would have been like $40 to renew, and this was only a little testing thing. So I've got this domain. I was really happy about it and quickly put together a demo page in pure HTML and CSS, so it was easy to deploy. I deployed it, and suddenly it timed out on me. Why would it time out? My server isn't slow. And nothing was in the log. Was I scammed? Did I not get the domain that I wanted? Did I waste 10 euros at a point where I didn't have reliable income, which would not have been good?
In reality, a domain did exist, and it actually resolved to my server, no problem. However: zero-width joiners. This is a character that you can insert between two characters to make a third one, which is used to make flag emoji. The representation of the transgender flag emoji is a white flag and a transgender symbol. So yeah, I had a white flag and a transgender symbol as the domain, which doesn't look as good, I think we can all agree. The zero-width joiner was not allowed in domain names, but it has a Punycode encoding, at least depending on which encoder you use. I can set arbitrary strings as subdomain names, so I tried setting one.
Firefox did convert it back then, and it worked, but now it just goes straight to the default search engine. It doesn't even try. And well, at least my blog is the first result, so there's something. Chromium does convert it. A year ago it was this unreadable string, now it's another unreadable string. The first wasn't valid Punycode. I don't know how that happened, but they fixed it. Then I tried dig, which is a DNS command-line tool. It doesn't convert to Punycode, so I had to do it myself. It also has validity checks, which you have to disable in order to look up the DNS records, but then it does actually work with the Punycode I just copied from the Chromium browser.
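The zero-width joiner problem described above is easier to see when the emoji is taken apart into its code points. The following is a small sketch using only Python's standard library; the constant names and the helpers `describe` and `strip_zwj` are invented for this illustration, and dropping the joiner by simple string replacement is an assumption about what happened to the domain, not something the registrar documented.

```python
import unicodedata

# Building blocks of the transgender pride flag emoji.
WHITE_FLAG = "\U0001F3F3"
VS16 = "\uFE0F"            # variation selector requesting emoji presentation
ZWJ = "\u200D"             # zero-width joiner
TRANSGENDER_SYMBOL = "\u26A7"

TRANSGENDER_FLAG = WHITE_FLAG + VS16 + ZWJ + TRANSGENDER_SYMBOL + VS16


def describe(text: str) -> list[str]:
    """List every code point in text with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


def strip_zwj(text: str) -> str:
    """Drop all zero-width joiners, as a system that rejects them might do."""
    return text.replace(ZWJ, "")


for line in describe(TRANSGENDER_FLAG):
    print(line)

# Without the joiner, renderers show two separate symbols instead of one flag.
print(strip_zwj(TRANSGENDER_FLAG))
```

Once the joiner is gone, what is left is exactly the white flag followed by the transgender symbol, which matches the two separate characters that ended up in the registered domain.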
So then I thought, okay, I've got this registrar which does not support Punycode, but I can also move domains between registrars, and I'd like to have all my domains in one place. Back then my main domain landlord, Namecheap, did not support .ws domains at all; .ws is a TLD that has supporting emoji as its main selling point. Now apparently they do, but they don't like the Punycode, so I still can't move it over. I've still got two registrars to support.
Email, another thing. Email just loves to break. It breaks on its own, so you don't even need an emoji. But yeah, first of all, an emoji domain as a mail domain with Punycode doesn't work, because my mail server breaks while creating it. It just returns a 500. It's a known issue, and it's supposed to be fixed, but for some reason it still doesn't work even though I'm on the newest version, so sadly I couldn't test that yet. Punycode in the local part is not interesting, because email addresses allow so many characters that they probably shouldn't; you probably know how difficult it is to write a regex for email addresses that doesn't have false negatives. So yeah, you can just use emoji without encoding, and we've got a couple of results. All of these are the transgender pride flag. Here we used the zero-width joiner, apparently, in the first example; the second one also just shows a white flag. And the funny thing was, when I tried sending an email to Outlook, I think I'm now on a spam list with them, because they really did not like that. I didn't actually know this SMTPUTF8 thing that they said I was lacking. But yeah, they were pretty unhappy, and my next test mail actually landed in my spam folder, which had never happened to me before, but whatever.
But domains are not the only things that you can break using emoji. Here's a little tale from a job I left a while ago: one of the first tickets I got said the input sanitization was broken.
People were entering emoji into their username, the database didn't have the correct collation to support that, and it broke. Input sanitization was actually fine. We just had two different pieces of software that allowed changing your name, and we didn't have access to the other one, so they just fixed the database, and my username stayed Jennifer Cool for a while. At least while I was there. Maybe the account is still there. Maybe they still use it to test emoji. I don't know.
GitHub, this was also interesting. They have these statuses, which are an emoji and some text. And they did not fully support Emoji 13, which introduced the transgender pride flag last year. But the interesting thing is, depending on the user agent, it looked very different. The one that looks correct is iOS. Then we've got Firefox, which does the whole writing-the-text-two-letters-in-a-row thing. And then we've also got Chromium, which overlays the text with the alt text for the emoji. It's been fixed now, which makes my status a lot less fun to look at.
So, iCloud Keynote, we're getting meta. It's not an issue with installed fonts, which I actually checked. It seemingly does not support zero-width joiners. It doesn't happen in presentation mode, so yeah, at least that was safe. But it did happen in edit mode. It happens in Firefox and Chromium on Linux, and it happens in Chromium and Safari on macOS. Probably also on Windows, but I'm not going to try that. It does not happen with the iOS and macOS Keynote apps.
So yeah, staying meta: breaking this event. I did not break that screen, but as you can see, the Fahrplan is displaying a broken character next to the talk, which is actually better than if it had displayed the correct emoji, in my opinion. Also, apparently it generated many hours of work fixing the intro/outro generator, and I'm sorry about that. That's the not-so-fun part about breaking things: someone has to clean it up.
Yeah, substrings are another funny thing if you've got two flags.
As I've shown before, flags are made from two letters which are joined. So one country flag may be a substring of two other country flags. Not really that dangerous, but at least interesting.
Then I've got ownCloud, which started out strong, displaying my folder name correctly, fades a bit on the Linux client, and seems to have given up once I reach the sync view, where it's only displaying one of the emoji.
Then Mastodon. Mastodon also allows custom emoji, which are like a text and then you load a little image file, and someone decided to put all the custom emoji their instance had into a single post. That post managed to crash basically every client trying to render it. Mine included. I couldn't use the network for two days because it kept trying to render this post.
Also, there's this little bug, which I've got a GIF of here. When I try to paste the transgender flag in Chromium on Linux, it doesn't work, while in Firefox it does. Chromium does not know this combination and renders it as the white flag and unknown-character symbols. When the zero-width joiner is deleted, the transgender symbol is shown, which is just an interesting way to handle combinations that they don't know. I don't know why Chromium doesn't work with the emoji. Actually, the package I've got installed to display emoji is from Google, so I don't know.
Device names seem to break a lot of things if you include emoji in them. I actually changed my phone's device name; so far I haven't found anything, sadly. Maybe in the future.
Payment providers. This is something that happened at work, where a large payment provider sent empty responses with a 200 status code, so no information about what broke, and later we found out from their side that a rainbow emoji in the order description seemed to be the cause. Once that was removed, it worked again. And what gave me the idea of adding an emoji to my device name: apparently some banks won't let you use online banking on your phone if you've got an emoji in the device name.
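The flag substring effect mentioned at the start of this part can be reproduced in a few lines. Country flags are encoded as two regional indicator symbols, one per letter of the country code, so a plain substring search has no idea where one flag ends and the next begins. This is a minimal sketch; the helper name `flag` is invented for the illustration.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # code point of REGIONAL INDICATOR SYMBOL LETTER A


def flag(country_code: str) -> str:
    """Build a flag emoji from a two-letter country code (hypothetical helper)."""
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(letter) - ord("A"))
        for letter in country_code.upper()
    )


# Germany followed by Spain is the indicator sequence D, E, E, S ...
text = flag("DE") + flag("ES")

# ... so the pair E, E (Estonia) is found in the middle, although
# nobody ever typed that flag.
print(flag("EE") in text)   # True
print(flag("FR") in text)   # False
```

Searching, highlighting or truncating such text correctly needs to work on whole flags rather than on individual code points.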
That's definitely a sign of very stable backend code that you want to trust with your money.
Not as fun, among the things you can break with emoji: screen readers. Too many emoji are terrible, because they tend to have pretty long names and a screen reader will speak every single instance of an emoji. So if you've got a username with tons of different emoji in it, the screen reader will spend maybe a minute just saying your username before your post. Also, there's art made out of emoji, which sometimes breaks on its own; we've got a bunch of blanks there. It also causes screen readers to really announce everything. I checked this post with VoiceOver and it was really terrible. That bot does not exist anymore, so don't go complaining to anyone because of it. So I basically think that accessibility is just more important than fun with emoji. If you want to create art out of characters, you can just take a screenshot and write an image description; that way everyone can join in on the fun.
But why would I do all that? Well, first of all, it's easy. Everyone has an emoji keyboard, phones have them, operating systems have them, and it's just easy to take some emoji and put them everywhere, in every single input field you can find. It's not likely to get me sued. There are other ways of breaking stuff that are more likely to actually break something, but unless they've got a terribly messed-up backend system, generally not much happens except the display breaking. And yeah, I'm just curious like that.
Did we learn anything from this? Well, a bit, maybe. Emoji domains are definitely not production ready. Users will be confused when they just see this unreadable string; they think they will be redirected, or maybe they think they've been hacked or something. Don't use them for anything important. Not all browsers can resolve emoji domains.
You might not get the domain you think you're buying. So generally, only for fun projects, not for anything productive. But emoji overall are just here to stay, so test your applications with them. Most stuff breaks without a larger impact; as I said, it won't get me sued. That makes it even more interesting when something really breaks. And also, supporting emoji has the nice effect of actually supporting useful things, which are scripts that are not Latin. Many people have names that may not be able to be entered into your system. And that's pretty shitty, because people are just named what they are, and you should support that. So yeah, if you support emoji, you usually get support for all names for free in addition to that. And also, domain renewal costs are just way too high. It's bad when you can get a domain for like 8 bucks, but then the next year they want 40. You've probably linked that domain everywhere, and now you can't afford to renew, which just isn't fun. It also paves the way for spamming and stuff like that. So yeah, I was a bit too quick this time, sorry for that. Thank you for listening. Go break some stuff and have a wonderful remaining conference.
Thank you very much for your talk. Thank you for listening. We still have 10 minutes. That means if you're willing, we can take some questions and have some Q&A.
Okay, yeah. A very technical question about Unicode. Who came up with this idea of creating new characters by using the zero-width joiner? Because one of the good things about Unicode was that it got rid of a lot of really weird encoding systems that use multiple bytes to define certain symbols. And so now we have this one 32-bit number that can represent enough characters to probably last us for more than a lifetime. And then they decide, okay, that's not enough, we want to combine multiple of these code points to define new characters. That seems really weird to me. Do you know why they decided to do that?
Yeah.
So basically the history of emoji, a bit shortened, is that it used to be used a lot in Japan for communication via SMS. It was never meant to be this thing that we've got today, where basically everything has an emoji and there's every emoji in lots of different skin colors and all that. And at some point the Unicode Consortium was formed, or it was already formed, but at some point they adopted emoji, and they kept having to add more and more things, like these country flags. Every country wants to be an emoji, and every single one would have taken up a code point and maybe an update or something, so now they just take the ISO string, the short code for a country, and join that to make the country flag. If I remember correctly, that was the first usage of the zero-width joiner. Later they used it for skin color indications, which also made sense, because having every emoji under like five different codes would be a lot. And then they added the rainbow flag, which was a rainbow and a white flag. And they kind of kept going.
Other questions? Yes?
Not a question, but just about the topic: at GPN19, I think, there was a presentation about emoji. I don't remember it that well, but I think he got into how it all works and how it came to be. So you might look it up. It's online. You can always watch it.
Oh yeah, in the back.
Was there also something that didn't break when you expected it to break?
Well, I try to look at the world in the most optimistic and positive way, so I just don't think things will break. Was that believable? I don't know. Yeah, I don't have an example. I probably tried some things that I thought would break and never did, but I don't know what exactly.
Okay, thank you. We still have time for questions. Yes?
Have you tried using emoji in passwords?
Well, all my passwords are auto-generated, and my password generator does not yet have an emoji option. So I've got to disappoint you there.
Security is probably a bit more important to me than breaking their password fields, but yeah, that's definitely an interesting thing to try, maybe not on your main account. It could also be funny with maximum lengths, because an emoji may be multiple characters, but it's not always treated like that.
Hi. You know Unicode also has this left-to-right, right-to-left thing.
Yes.
Did you fiddle around with that one in combination with emoji?
Not yet.
Okay, we still have five minutes left, so there's still time for questions. Please try to keep it concise.
Did you find a job?
Yes. I don't actually think my portfolio mattered, though. But there's a question back there.
Well, my question is about Punycode. Did you experience many broken implementations? Because when I experimented with Punycode, with a lot of tools some kind of double encoding happened. So you put in Unicode, you get an emoticon or emoji out of that somewhere in between, and then there was some kind of double encoding. At times it was Punycode, and sometimes I was scared of what happened. Was this a comparable issue when you got different results, when you copied and pasted the URLs from Chromium, for example?
Yeah, it's like that. Many converters are broken. The website that let me buy this domain that would have included a zero-width joiner has a broken converter, otherwise it probably wouldn't have let me buy that. And Firefox and Chromium: Chromium used to just produce incorrect Punycode. Yeah, it's a dumpster fire, basically.
Okay, then. If there are no further questions, please, one more round of applause for our speaker.
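The remark in the Q&A about maximum lengths, that an emoji may be multiple characters but is not always treated like that, can be made concrete by measuring the same string in different units. The sketch below is only an illustration: the function name `lengths` is invented, and which unit a given password field or database column actually enforces is an assumption that has to be checked per system.

```python
def lengths(text: str) -> dict[str, int]:
    """Measure one string in three units that length limits commonly use."""
    return {
        "code_points": len(text),                          # Python's len()
        "utf16_units": len(text.encode("utf-16-le")) // 2,  # e.g. JavaScript's .length
        "utf8_bytes": len(text.encode("utf-8")),           # typical storage size
    }


# White flag, variation selector, zero-width joiner, transgender symbol,
# variation selector: displayed as a single flag where it is supported.
TRANSGENDER_FLAG = "\U0001F3F3\uFE0F\u200D\u26A7\uFE0F"

print(lengths("a"))
# {'code_points': 1, 'utf16_units': 1, 'utf8_bytes': 1}
print(lengths(TRANSGENDER_FLAG))
# {'code_points': 5, 'utf16_units': 6, 'utf8_bytes': 16}
```

One visible flag therefore counts as 5, 6 or 16 depending on who is counting, which is why a frontend and a backend can disagree about whether an input still fits.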