Welcome to this evening's next talk with the wonderfully broken title, I love it, "Emoji Domains and How Wonderfully Broken They Are", by a very, very wonderful person, Genois, who is a web developer and, you wouldn't believe it, her nickname is Unicorn, and here is her unicorn. Hi, Genois, tell us everything about emoji domains and why they are so rotten-broken.
Yeah, thank you a lot. Exactly, I'll be speaking a bit about these wonderfully broken things. The talk will start with a bit of an info dump about the history of emoji domains and what they actually are, and then I will talk about my personal experience breaking things with them. So yeah, let's start right off with the history. DNS was standardized in 1987 with a very limited character set. You can see it's only Roman letters, some numbers and like four non-letters. These are definitely not sufficient for many languages. It's a very Eurocentric view, or not even just Eurocentric, it's actually very centered on the English language, and it was clear that these wouldn't suffice. So in 1996, internationalized domain names were proposed, which allow encoding characters that are not officially supported into this very small character set, so that browsers could simply convert them on the fly. Sources kind of disagree on when exactly this went live, or when you were able to use it for the first time. The IDNA 2003 standard added the support, but the first emoji domains were actually registered in 2001. What's interesting about this is that in 2001 emoji weren't part of Unicode yet. You can see the examples here, like the hot springs emoji, which work because those characters are both emoji and Unicode pictographs. So they were not actually emoji domains at the time. The characters were later turned into emoji, but back then they were just pictographs.
I couldn't really find out if those domains actually resolved if you entered the pictographs back then, or if it was just someone hoping that they would rise in price once IDNA 2003, or whatever standard would implement it, went live. There was also an IDNA 2003 normalization, but that's not too interesting for us because we just want to look at the emoji side of things. IDNA 2008 actually banned emoji for most major TLDs because of concerns that they would be used for phishing domains that look very similar to other, real domains. Like, every letter also exists as an emoji so that you can make country flags. That could be used for phishing, so they decided to ban it for most major TLDs that comply with IDNA 2008. And, just important to my little story, in 2020 the Emoji 13 standard added a transgender pride flag emoji. You will see why that's important later. So what actually is this Punycode encoding? It's a non-human-readable representation of Unicode characters. You can see this symbol here would be translated to xn--c8h, which obviously doesn't make much sense to type in, but your browser takes care of this. So DNS didn't have to be changed. It's only inside your browser that these conversions happen. Compatible browsers, depending on which browser you use, will translate either opaquely or semi-transparently. Firefox, for example, as a mitigation against these phishing attempts, does allow you to enter emoji or other Unicode characters, but as soon as you enter it, the URL bar will show this xn-- domain. Safari, as far as I know, does it opaquely, so you will not know what exactly the Punycode representation is of what you were just entering. And different TLDs only support a specific subset. As I said, IDNA 2008 actually banned it. Oh, fun fact I forgot on the last slide: IDNA 2008 went live in 2010, which is kind of confusing, but whatever. So, yeah, different TLDs only support specific character sets.
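As a rough illustration of the encoding step described above, here is a minimal Python sketch that uses the standard library's `punycode` codec. The helper names `to_ace_label` and `from_ace_label` are made up for this example. The sketch deliberately skips the IDNA mapping and validation rules that browsers and registries apply on top, so it only shows the raw Punycode step for a single label. The label xn--c8h from the talk corresponds to U+26A7, the transgender symbol.

```python
ACE_PREFIX = "xn--"  # marks a label as Punycode-encoded


def to_ace_label(label: str) -> str:
    """Encode a single label; pure-ASCII labels are returned unchanged."""
    if label.isascii():
        return label
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def from_ace_label(label: str) -> str:
    """Decode a single xn-- label back to Unicode; other labels pass through."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


print(to_ace_label("\u26a7"))                    # xn--c8h
print(from_ace_label("xn--c8h") == "\u26a7")     # True
```

Because the conversion happens purely on the client side, the DNS server only ever sees the ASCII form.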
Most don't support emoji, but there are TLDs that have supporting emoji as their main selling point. These are TLDs that most people wouldn't want to use unless they simply are interested in emoji. So why did I end up breaking things with it? In early 2011, no, 2021, this year, I was unemployed and looking for interesting ways to build my portfolio. I knew that emoji were somewhat supported, but I didn't know how exactly it worked. I just knew that there were some people who had emoji domains. I was happy that a transgender pride emoji had been added. So I decided, well, maybe it's a good idea to get some domain that contains this transgender pride emoji, to also become less interesting for bigoted potential employers. So, yeah, let's register a domain with that emoji. Well, that seems to be a bit more difficult, because these domains, even though you never really encounter them, seem to be sold out. Nothing I looked up worked. And actually the web interface broke a bit, but more on that later. Well, none of these domains actually resolve to anything. .dev does not support emoji at all, and Namecheap doesn't support emoji even with top-level domains that do support them. So I had to go to another registrar, which was a bit annoying because I like having everything in one place, not specifically because I love Namecheap or anything, but whatever. A few minutes later, I am now the proud owner of transgender pride flag, purple heart, .ws. At least that's what I think. So I set out to build a small demo page for it, deploy it on my server and test it. And wow, my server usually isn't slow enough to time out. Well, the route looks okay inside my reverse proxy, trying again. And after a long time, I end up with this wonderful error message: we're sorry, that domain is invalid. It also doesn't show the transgender pride flag anymore. But that could simply be down to their web font not supporting it yet, because it was just added in Emoji 13.
At least that's what I thought at that point. Obviously, I was a bit scared because, well, I had just spent 10 euros on something and I didn't really know when I would have a stable income again. I did this to find a new job, and in Germany unemployment benefits are really difficult to get. So I was a bit scared. But GoDaddy didn't sell me some invalid domain, or at least they definitely did not scam me, because if you enter these exact characters that apparently are invalid, it does resolve to my server. When I looked at the GoDaddy web interface, it also showed these three characters: the purple heart, the white flag and the transgender symbol. It's simply not the domain that I had entered into the emoji domain search engine. So it wasn't just their web font not supporting it. And that is caused by the wonderful zero-width joiner. To avoid having tons of similar emoji, each with their own code point, many emoji are created by combining others. You have the skin tone modifiers, for example, or the country flags, which combine several code points, and many other emoji that join existing ones with a zero-width joiner. The transgender pride flag is a combination of a white flag and the transgender symbol with a zero-width joiner in between. And the thing is, the Punycode conversion does not really support that, so the joiner was simply dropped during conversion when I bought my domain. But that's not everything, because I still had this project, I still wanted emoji domains, and my interest was piqued. So I wanted to try out what else I could break. To avoid spending even more money on this project, I chose to move my testing to subdomains, which was a good idea because I have way more control over subdomains than I have over regular domains. I can register them with any registrar, so I could just use my go-to registrar. I can register whatever strings I want, even invalid Punycode. And I can register them under a TLD that doesn't allow emoji, because it's not a second-level domain but a third-level domain.
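To make the zero-width joiner problem described above more concrete, here is a small Python sketch that lists the code points of the transgender pride flag sequence and compares the raw Punycode of the label with and without the joiner. The helper names are hypothetical, and simply deleting U+200D is only an assumption about what happened during registration; the talk does not say which conversion step the registrar actually ran. The variation selectors (U+FE0F) are left in place here, which is also an assumption.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER
# White flag + variation selector + ZWJ + transgender symbol + variation selector
TRANS_FLAG = "\U0001F3F3\uFE0F\u200D\u26A7\uFE0F"


def describe(text: str) -> list[str]:
    """Return one 'U+XXXX NAME' entry per code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


def drop_zwj(text: str) -> str:
    """Assumed lossy step: remove every zero-width joiner."""
    return text.replace(ZWJ, "")


def to_ace_label(label: str) -> str:
    """Raw Punycode of one label, without any IDNA mapping or validation."""
    return "xn--" + label.encode("punycode").decode("ascii")


for entry in describe(TRANS_FLAG):
    print(entry)

with_joiner = to_ace_label(TRANS_FLAG)
without_joiner = to_ace_label(drop_zwj(TRANS_FLAG))
print(with_joiner)
print(without_joiner)
print(with_joiner != without_joiner)  # True: these are two different DNS names
```

Once the joiner is gone, the remaining code points render as a separate white flag and transgender symbol, which matches the three characters shown in the registrar's interface.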
And yeah, let's see what browsers do about that. So I created the subdomain transgender pride flag dot dysphoric dot dev. Firefox converts it to xn--, and I'm not going to say all that. Chromium converts it to a different string, and if you plug either of those into a converter, it will tell you that both are invalid Punycode. However, both are understood and routed. So I simply added an "or" route to my reverse proxy so that both would work. If you use dig, which is a command line tool that lets you look up domain records, first of all, it doesn't do the Punycode conversion at all. So I had to use one of the strings that one of my browsers gave me. But when I used that string, it gave me this: it's not a valid IDNA 2008 name, disable validation using these two parameters. It also didn't tell me that I needed both. So I added the first and then, oh, you still need the second. But whatever. Once both were added, I was able to get correct results. And yeah, my site was reachable. The next thing I thought of was, what if I moved my domain to a non-supporting registrar? Because, as I just talked about, Namecheap does not actually allow emoji domains, and I was interested to see how their web interface would handle it. Sadly, it simply did not handle it at all, because they don't support .ws domains. I wasn't really going to contact their support team to try and still get it, because this was only a small thing and they were probably simply not interested in hosting that domain, because it, or other things about emoji domains, breaks their web interface. So I don't really see why their support team would actually be on my side here. Oops, sorry. So what about email? Because email clients really enjoy breaking, at least in my experience. Do they break with emoji?
When trying to add an emoji domain as a sender, my mail server actually broke, because validation was run after the Punycode-to-Unicode conversion, which caused an uncaught exception, which was surprising. It's already fixed, but the patch is not released yet, so I couldn't test it yet. But there's the local part, which I could already control as much as I wanted, and it led to mixed results. Thunderbird simply ignored it and just showed the Punycode, and Apple Mail dropped the zero-width joiner and also showed the Punycode under the thing where it shows the exact domain. So mixed results, nothing too spectacular, no exceptions or crashing clients or anything interesting like that, sadly. What did I learn during this? Well, obviously, emoji domains are very buggy. Implementation varies from browser to browser, so you can have the same input string and get different Punycode out of it. So testing in just one browser definitely is not enough. Well, it never is, but here it especially isn't. And you may be able to buy a domain that won't work as you would think, which can cause quite the annoyance. But it's still a lot of fun to mess around with this stuff, just not for productive use. I like to end my talks by telling people to join a labor union. It doesn't have anything to do with this, but that's what I do for some reason. I've also got a blog post about this where I've written it up, and I will publish the slides under the wonderful domain, a proof emoji, code dot ws. It's just a link to my regular blog for now. Yeah, I'm sorry, I think I went a bit fast, but I still thank you for your time and I'm open to questions.
I'm online. Oops, I'm sorry. I'm awfully sorry. My machine is slow. I unmuted myself about half a minute ago. Thank you for that beautiful talk, Genois. I had to grin a couple of times because it was great and it made my day. And actually we have a question. The question is in German. I'll say it in English.
Why is DNSSEC so complicated for emoji domains?
Well, because no one actually really likes emoji domains except the people who sell them. At least that was my experience looking things up for this. They are kind of disallowed in the standard, but some top-level domains just ignore the standard and still let you register them. And it's just something that people who implement things don't want to think about at all. I haven't actually tried DNSSEC, but it's just something that is easily forgotten because it shouldn't actually exist, which may be a bit harsh.
You remember the ringtone fads when smartphones didn't exist yet? Is this just a fad like this ringtone thing, and it will just disappear within the next couple of years? Or do you think emojis are here to stay? Is this serious?
I think emojis are here to stay, but not within domains. It was possible since 2001, kind of, but at least since 2011, when the first actual emoji domain was registered. But most domains that are popular examples already don't resolve anymore, or resolve to sites that sell emoji domains. So emoji domains definitely are not much more than a fad or a nice funny thing to just look at for a bit. However, emojis as a whole are such a large part of our culture. I don't think they're going to go away anytime soon, because it's been more than 10 years, and the annoying downloadable ringtones were popular for a bit less time, I think.
This is a question that I actually wanted to ask myself as well, because I run my own email server too: which email server software are you talking about? Do you know about support in others? What do you use as software on your email server?
My email server is running on Mailu, which is a set of Docker containers that are specially made to work together to make setting up an email server as painless as possible, for free. So I haven't actually tested any other servers. However, in theory they shouldn't actually have any issues.
The part of Mailu that failed wasn't actually the mail server part. It was simply a parser. So in theory it should work with another mail server, if they didn't also mess up parsing at some point.
Somebody asked here whether there is a list of top-level domains that support emojis, and somebody posted an answer pointing to Wikipedia. Is that correct? Wikipedia has such a list?
It has, but it isn't actually correct. The list on the English Wikipedia includes at least one domain that no longer supports emojis, which was actually some kind of big political thing where they removed support. So the Wikipedia list is not complete, or rather it contains too much. There are, however, registrars that specialize in emoji domains, and those will have current lists. I had .ws as one of them. It's not the red heart emoji though, because that is invalid Punycode, and so I don't really know what to enter in my URL bar to get to them other than searching for it on Google.
Next question: is there a difference between a single Punycode emoji and multiple emojis chained together as a second- or third-level domain?
It's just different Punycode depending on how many emojis you have. But theoretically the implementation for this, I think the technical term was ASCII-to-Unicode or something, which is an algorithm to convert it, handles multiple emoji the same way. It should work without any issues if one of the two works.
Are there any emoji first-level domains?
No, there are not. There are Punycode first-level domains, because there are languages that simply do not use the same letters as English does. So Punycode first-level domains exist, but no emoji first-level domains at this point. Maybe there will be, but I kind of doubt it, because to the people in charge of this, emoji domains are kind of an eyesore, from what I could read.
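The answer about single versus multiple emoji and second- versus third-level domains can be sketched in a few lines of Python: each dot-separated label is converted on its own, and a label with several emoji simply produces a longer Punycode string. The function names are invented for this example, and the sketch again uses only the raw `punycode` codec from the standard library, without the IDNA mapping and validation rules a real resolver or registry would apply.

```python
def label_to_ascii(label: str) -> str:
    """Convert one label to its xn-- form; ASCII labels stay as they are."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


def label_to_unicode(label: str) -> str:
    """Convert one xn-- label back to Unicode; other labels stay as they are."""
    if label.lower().startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label


def domain_to_ascii(name: str) -> str:
    """Convert a whole domain name label by label."""
    return ".".join(label_to_ascii(part) for part in name.split("."))


def domain_to_unicode(name: str) -> str:
    """Inverse of domain_to_ascii for names produced by it."""
    return ".".join(label_to_unicode(part) for part in name.split("."))


PURPLE_HEART = "\U0001F49C"
names = [
    f"{PURPLE_HEART}.example",                  # one emoji, second level
    f"{PURPLE_HEART}{PURPLE_HEART}.example",    # two emoji in one label
    f"{PURPLE_HEART}.sub.example",              # emoji at the third level
]

for name in names:
    ascii_name = domain_to_ascii(name)
    print(ascii_name)
    # The round trip gives back the original name in every case.
    print(domain_to_unicode(ascii_name) == name)
```

The level at which the emoji label sits makes no difference to the conversion itself; whether a registry accepts the result is a separate policy question.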
So talking about eyesores, I always have the impression that, at least to the old coders, diacritical signs in themselves were considered an eyesore. You know, the funny little dots those German-speaking people have on their letters, and don't even talk about the Czechs and the Poles. Now, my name contains such a diacritical sign. My first name is André, and I've been fighting with all kinds of inputs that say seven-bit ASCII and nothing else here. Do diacritical signs still break domains?
They should not, because they are actually the reason why IDNs exist. It was actually proposed by someone who has one of those signs in his name and probably just wanted the domain with his name. So this was the actual reason why we have Punycode in the first place, and supporting emoji was kind of an unwanted side effect. In theory it should work without issues, but still, many people don't think about it enough when implementing their own thing, so you can never be too certain that it will, but it should.
Somebody posted here, somebody who obviously runs Windows, and says that in Windows 10 the emoji menu comes up with the combination of the Windows key and the full stop. Is that common already or is that new? I think it's common by now. It's been implemented, and ever since, everybody's been using emojis. And there's also a remark here that says MS Outlook actually has pretty good Punycode support, but still, don't try emojis.
Yeah, I remember there's the story about when the Yugoslav wars, especially the one in Bosnia, broke out. There were about 100,000 Bosnians who fled to Switzerland, and about 50,000 of them were granted citizenship, but they couldn't be registered in the citizenship register because that only supported 7-bit or 8-bit ASCII, but none of the diacritical signs in their names. I think they fixed it by now, but that was quite a thing some years back. Yeah, I see no further questions. Oh yeah, there's one, excuse me, that came in right now.
Is there a uniform way to generate Punycode across multiple platforms? Mobiles do not work well with entering Unicode numbers, as we all know.
I'm not sure I understood this correctly, but the easiest way, the one I used during my testing, was simple online converters, which work on every platform. Actually, my system doesn't have a shortcut for emojis, so I would always copy and paste the emoji from Emojipedia into an online Punycode converter and just use it from there, because I don't actually use emoji that much.
Okay, we have come to the end of our time. We'd still have another minute or two, but we have no more questions. Thank you in the meantime for coming and holding this talk. You'll have another talk, I think it's tomorrow.