Okay. So, hi everybody, I'm William Molinari and I'll talk about the open source behind every request. First thing: I'm Brazilian, so I'm not a native English speaker. You know, we tend to destroy everyone's games on the internet. We have a lot of Brazilians here, by the way, so we will conquer the world sometime. I have 100-plus slides and just 30 minutes to say everything, so this will be fast. Yeah, I tend to speak fast. So sit tight and let's go. I have some water here and I'll need it.
So, I'm William Molinari. I'm also known by this weird nickname, PotHix. Yeah, that's something from Diablo 2, some 10 years ago. I work for Locaweb, so thanks, Locaweb, for having me here; they are paying the expenses from Brazil to Brussels. I'm also the main organizer of the São Paulo Ruby Users Group. It's the biggest Ruby users group in Brazil, with some meetings; it's a kind of cool thing. And I tend to do HTML5 game development in my spare time, just for fun. So you can check it out if you want to. That's it about me.
Let's go to the motivation for this presentation. The main motivation is that I wrote a book about this in Brazilian Portuguese. So, yeah, it would not be so useful for you, but I intend to translate it to English in the future. Just tell me if you think that is a good idea. The book is about what happens in a web request: a lot of things happening from the operating system, to networking, to, I don't know, server frameworks, and that kind of thing. Here I focus on the desktop part, and I'm not going so deep on the server side.
So, let's start with our user. I'm a big fan of J.R.R. Tolkien. Yeah, I'm wearing a Gondor t-shirt, and I'm using a lot of his characters for this. So let's imagine Gandalf as a user, and he's trying to access my personal blog. I don't know why, but he's trying to do that. So, the first thing that will happen: he'll use a browser, of course.
And we'll use Chromium in this case. So, it's the first open source software here. The first thing is to type a URL there, and the first question that Chromium will ask is: is it really a URL? It may look silly, right? Why would we not type a URL in the address bar? But you know that you can type whatever you want in the address bar, and it will just ask a search engine for you. In this case, it would use Google to search for it. But we will assume that it is a URL, and we can continue to the next question.
That is: is it in the HSTS list? By HSTS, I mean HTTP Strict Transport Security. Chromium has a hard-coded list of websites that must be accessed via HTTPS first, so it will ask whether the URL is present on this list. And every server can send a Strict-Transport-Security header, so it will be included there dynamically. There are a lot of links here; this one is a link to the Chromium code if you want to look at it. In this case, we will continue with HTTP for a while.
The next question is: does it have a cache for that? We will assume in this presentation that there's no cache anywhere, because otherwise it would be too easy, right? Just return the cache and thank you. So we will continue without caching, and the browser just continues. Caching can be done via Expires, Cache-Control, or ETag; there are a lot of HTTP headers here.
So, there's no cache and we can continue. The browser will split the URL into three different parts: protocol, domain, and path. We have to continue with the domain, because the internet doesn't know what pothix.com really means, right? We have to use an IP address and this kind of thing. So we have to get the IP address for pothix.com, and it will check whether we have a DNS cache for that. Chromium has its own implementation of DNS. You can check The Architecture of Open Source Applications book; there's a chapter about that.
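As a rough illustration of the split into protocol, domain, and path described above, here is a minimal sketch using Python's standard `urllib.parse`. This is not how Chromium does it internally; the helper name `split_url` and the `example.com` URL are placeholders chosen for this example.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Return (protocol, domain, path) for a URL, defaulting the path to "/"."""
    parts = urlsplit(url)
    # An empty path means the root of the site, which a browser requests as "/".
    return parts.scheme, parts.hostname, parts.path or "/"


protocol, domain, path = split_url("http://example.com/blog/")
print(protocol, domain, path)
```

Only the domain part is needed for the next step, the DNS lookup; the protocol and path come back into play once a connection exists.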
So, Chromium has its own implementation of DNS so it can go faster. But we will not continue with that. There are two main implementations: the internal one, and getaddrinfo, which uses the operating system. We will use getaddrinfo so we can get into the operating system right now.
By operating system, I mean Linux and glibc, right? That's the GNU libc. glibc will look at your /etc/hosts. You may have defined your domain in your /etc/hosts, so that the domain points to a different IP address. In that case it will not go to the real DNS; it will just use your /etc/hosts. There's a link to the implementation if you want to look at it. So, that's getaddrinfo. I'm not going to deep dive into the code because, you know, it's boring. But you'll have that in the slides.
What getaddrinfo will do first is ask the name service cache daemon. This is another layer of caching: you may have a cache for your domain resolution, so it will ask the name service cache daemon. Obviously, we are not using a cache in this case either.
So, the big picture is: we have a user, and we are getting into the operating system to leave to the internet, right? But to leave to the internet, we will have to go a little further, into a little bit of theory: the OSI model. I don't know about you, but every time I talk about the OSI model, I get one of those caps on my head, because it's so academic, right? So boring to talk about. So let's try to simplify this a little bit so it's not so boring. Let's remove those two layers there, and use just five layers here. I split that into the operating system part and the user part. So we have application for the user, and those other two layers for the operating system. Now we can see them as protocols, and we will look through each of those protocols one by one. So, let's get started with DNS.
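To make the getaddrinfo step a bit more concrete, here is a small sketch using Python's `socket.getaddrinfo`, which wraps the same libc call on Linux. The helper name `resolve` is made up for this example, and `localhost` is used only because it is normally answered from /etc/hosts without touching the network.

```python
import socket


def resolve(host, port=80):
    """Return a list of (address family name, IP address) pairs for a host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    results = []
    for family, _socktype, _proto, _canonname, sockaddr in infos:
        # sockaddr is (ip, port) for IPv4 and (ip, port, flow, scope) for IPv6.
        results.append((socket.AddressFamily(family).name, sockaddr[0]))
    return results


for name in ("localhost", "127.0.0.1"):
    try:
        print(name, resolve(name))
    except socket.gaierror as exc:
        # Resolution can fail on machines without a matching hosts entry.
        print(name, "failed:", exc)
```

On a machine with both IPv4 and IPv6 configured, a name may come back with one entry per address family, which is what the IPv4 and IPv6 attempts later in the talk are about.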
As I said, just a parenthesis here: Chrome has its own implementation of DNS. You can check it out if you want; this is the link to the Chromium source code. Getting back to getaddrinfo, we have two syscalls here: a call to socket and a call to connect. There's the source code. We are using UDP/IP: SOCK_DGRAM for UDP and IPPROTO_IP for IP. So we are using UDP/IP and just assigning an IP address via connect. The connect just takes this IP address and assigns it to the socket.
We can check that with strace if you want. Don't worry about the whole command. Chrome has a lot of processes: for rendering, for tabs, for extensions, for a lot of things. So I'm just getting all the PIDs there. Let's focus on the strace part. We have -f to get all the threads, and we're filtering for just the socket and connect syscalls.
So, what do we have here? We have a socket for IPv6 with UDP/IP, so it's trying to get the IP via IPv6. It tries to connect, and I don't have IPv6 configured on this laptop, so the network is unreachable. Then it tries IPv4, PF_INET, with UDP/IP to reach the DNS. We can see here the IPv4 and IPv6 addresses for Google DNS, doing the domain resolution over IPv4 and IPv6. So we use UDP/IP and we get the IP of my personal blog. We get the data that we want.
But why is it trying IPv4 and IPv6? I didn't ask for that. The reason is that we have this algorithm called Happy Eyeballs working behind the scenes. And we have the RFC for you. Who doesn't want to read an RFC on a Sunday, right? It's so cool. So that's it, if you want to read it. What Happy Eyeballs does is just request both entries in the DNS, A and AAAA, for IPv4 and IPv6. So you make a request for both, IPv4 and IPv6, and you just use IPv4 in case IPv6 is not working, or use IPv6 if it's working. And why do we need that?
We are all using that without knowing it. It's because we are trying to transition from IPv4 to IPv6, right? And not all IPv6 connections are fully working. So we are doing that to use IPv6 when it's available, without penalizing our user because of it. So that's it, Happy Eyeballs is helping us. curl implements that; Firefox, Chromium, the newer Macs, and iOS implement that. So it's working.
So, let's get to UDP/IP in this case. We are still in DNS; we have not reached HTTP yet. UDP is implemented inside the Linux kernel. So this is the operating system, right? I'm not a kernel hacker, so I'm not sure about everything that is happening there, but this is the first file that you can look at if you want to understand some things. What happens with UDP is that we just assign an IP address and send the data to the internet, and we don't know whether we will receive a response or a confirmation about it. So that's UDP; we will get to TCP soon. So that's it, we could finally go to the physical layer, but let's leave that for a bit later.
So, the DNS. I did this with dnstracer. What's happening with DNS here, with domain resolution? We are this user, and we will ask our recursive name server; in my case it's my personal router at home. There's no cache, remember that, there's no cache. It asks one of those 13 root servers, which sends us the IP of one of the top-level domain servers. In my case it's pothix.com, so it's the top-level domain server for .com. From that I get my authoritative server, and I finally get my IP address, the 192 one up there. So that's it, we finally got what we were looking for, which is my IP address, and we can finally get to HTTP, which is what we were after.
So, let's get started from scratch. We have TCP/IP with HTTP now, and let's get started with TCP first because it will be better to explain.
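The socket and connect calls seen in the strace output, and the "send it and hope" nature of UDP described above, can be sketched on the loopback interface. This is only an illustration: the helper name `udp_roundtrip` is invented, and a local receiver stands in for the DNS server so that nothing leaves the machine.

```python
import socket


def udp_roundtrip(payload: bytes) -> bytes:
    """Send one datagram over loopback and return what the receiver got."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    receiver.settimeout(2.0)         # UDP gives no delivery guarantee, so do not wait forever
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # On a UDP socket, connect() only records the peer address.
        # No packet is sent and there is no handshake.
        sender.connect(receiver.getsockname())
        sender.send(payload)
        data, _addr = receiver.recvfrom(4096)
        return data
    finally:
        sender.close()
        receiver.close()


print(udp_roundtrip(b"who is example.com?"))
```

The timeout on the receiver is the interesting design detail: with UDP, the application itself has to decide how long to wait and whether to retry, because the protocol will not tell it that a datagram was lost.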
So, yeah, TCP is implemented inside the Linux kernel. You can look at tcp.c if you want to. And TCP is really different from UDP, right? You have a lot of control and states and, yeah, we have a connection here, not just a packet being sent. But we will not go into detail on that because it's boring, right? So let's look at the three-way handshake and how we establish a connection.
We'll use Gandalf again for this case. What's happening here is that we have a server listening on the other side on port 80; for example, my blog is running on port 80. Gandalf in this case is the operating system. He will send a SYN packet to the server. This is the first step: he tries to connect to the server. And the server says: nice, I want to accept the connection, and I want to connect to you as well. So it sends a SYN-ACK as a confirmation back to the other side. This is the second step. As a third step, Gandalf just confirms. So we have a connection from Gandalf to the server and from the server to Gandalf. It's a full-duplex connection, and we have a happy Gandalf, right? Yeah, that's it.
So, now that we have a connection, we can go to HTTP and HTTPS. We'll start with HTTPS because we want to have an encrypted connection here. HTTPS is just HTTP over TLS. So we have an encryption "tunnel", in quotes, here. It sits just like that, between HTTP and TCP/IP. This is based on this article; it's a dense article if you want to read it.
Okay, let's imagine that we already have a connection here. You remember, that's the happy Gandalf. So the first thing to do is to send a list of ciphers and a URL from client to server. By a list of ciphers I mean this thing. You can check your ciphers if you go to this website. There are a lot of encryption algorithms and key exchange algorithms and, yeah, that's it. A lot of things we are not covering in detail here.
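To get a feeling for what "a list of ciphers" looks like, here is a short sketch that asks Python's `ssl` module which cipher suites a default client context is willing to offer. The exact list depends on the local OpenSSL build, so it will differ from the one on the slide; the helper name `offered_cipher_names` is made up for this example.

```python
import ssl


def offered_cipher_names():
    """Return the names of the cipher suites enabled in a default client context."""
    context = ssl.create_default_context()
    # get_ciphers() returns one dict per enabled suite; "name" is its OpenSSL name.
    return [cipher["name"] for cipher in context.get_ciphers()]


names = offered_cipher_names()
print(len(names), "cipher suites enabled, for example:")
for name in names[:5]:
    print(" ", name)
```

Modern defaults no longer include RC4 or MD5 based suites, so the suite used as the example in this talk is unlikely to show up in the output.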
The server will pick one of those ciphers and send it back with the certificate. In this case we are using TLS RSA with RC4 and, yeah, that's it. What we can take from that is that we'll use RSA for asymmetric cryptography, RC4 for symmetric cryptography, and MD5 for hashing. So we use the public key and private key to exchange the key. For symmetric cryptography, we use this key to exchange information. And we will use hashing just to compare packets and see if everything is okay. So, that's it. Just a second. That awkward moment.
We will check the certificate, because the certificate is solving a trust problem, right? We don't know for sure if this guy is pothix.com. So we are asking a third party whether he's really pothix.com, and the certificate solves this problem. We will just check the signature on the certificate with one of those public keys that we have on our laptop, for example. If the signature matches the one the third party made with its private key, we can be sure that this guy is really pothix.com, provided we trust the third party. That may be, I don't know, VeriSign or Let's Encrypt. So we will check that we have a trustworthy CA, a certificate authority. We will check the validity dates, in case the certificate is expired. We will check the expected URL. And we will have to check that the certificate is still valid. So we will have those checks.
Then we can do a master key exchange. There's a complicated process going on here. I'm not going into detail, but there's a random number going from one side to the other, and a lot of, I don't know, cryptography; those algorithms go into action here. So we are not going so deep here. But there's a big key on both sides, generated in a secure way. So finally they have a trustworthy connection here.
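The way a cipher suite name breaks down into the three roles mentioned above can be sketched with plain string handling. This assumes the suite on the slide is the one whose standard name is `TLS_RSA_WITH_RC4_128_MD5`; the helper `describe_suite` is hypothetical and only understands names of the older `TLS_<key exchange>_WITH_<cipher>_<hash>` shape.

```python
def describe_suite(name):
    """Split a TLS 1.2-style cipher suite name into its three roles."""
    prefix, separator, rest = name.partition("_WITH_")
    if not prefix.startswith("TLS_") or not separator or "_" not in rest:
        raise ValueError(f"unsupported cipher suite name: {name!r}")
    key_exchange = prefix[len("TLS_"):]   # asymmetric part, used to agree on a key
    cipher, _, mac = rest.rpartition("_")  # symmetric cipher, then the hash used for integrity
    return {"key_exchange": key_exchange, "cipher": cipher, "mac": mac}


print(describe_suite("TLS_RSA_WITH_RC4_128_MD5"))
```

Newer suite names do not map this cleanly (for AEAD suites the trailing hash is not a per-packet MAC), so this is a reading aid for the example in the talk rather than a general parser.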
And it will use RC4, a symmetric cipher. I don't know if this is a good explanation, but it's just like your Wi-Fi: both sides have a big key and they use symmetric cryptography with it. And MD5 is for content verification, so every packet will be verified with MD5 in this case, right, in this cipher that we chose. So that's it. Now we finally have TCP as a connection and TLS as an encrypted connection as well.
So we can use HTTP or HTTP/2. HTTP/2 is the best one, I think. I would recommend you use HTTP/2. HTTP/2 is really cool and is the future. But we will use HTTP just as an example, because it's plain text and it's easy to show. There's a link to the Chrome implementation of HTTP, and of HTTP/2, which is called SPDY in the Chromium code, if you want to look at it.
Okay, so this is HTTP. There's no rocket science here, right? It's just a GET here, with the file that we want and the protocol name and number. And we are using Host here because I'm using GitHub Pages in this case; they may have a load balancer and you'll have to use that. But that's it, there's no rocket science, that's HTTP.
So we finally have that and we go out to the internet. We use Ethernet or Wi-Fi, and there's the physical layer. It's not exactly like that, but we'll have those two layers down there. This is based on a paper. So let's go inside the Linux kernel, in quotes. We have a socket library there, like glibc for example, that will talk to the TCP/IP stack inside the Linux kernel, which will talk to the frameworks and drivers developed there. So we can finally go to the hardware and leave the operating system, and the machine, by the way. This is the paper I mentioned on the last slide, so you can read it if you want. So we are going to the kernel, in ipv4/tcp.c; there are some frameworks to develop device drivers for wireless and that. And we have iwlwifi, which is the driver for this laptop.
And the only part that is not open source here is the firmware, the Intel firmware. So that's it, it's a blob. Everything else is open source. So we are finally leaving the operating system, and we will go through our router to the internet. I'm not talking about the router because I will not have enough time for that, but let's go to the internet.
By internet, I mean just a traceroute, because there are a lot of devices out there and you can't study all of that. So let's imagine a traceroute with TCP, because we lose a lot of packets if we use UDP here. So that's it: I'm going to my router, which is Valinor. Yeah, you know, it's another thing from Lord of the Rings. Then we're going to Virtua, which is a Brazilian ISP; we're going to the backbone that goes to New York City, probably, with NYK; then to the USA backbone and GitHub Pages, in this case. That's my personal blog. So that's it, let's imagine that's the internet, right? Just like in The IT Crowd, just that little box. Yeah, that's the internet.
I'm not focusing on this part, the server part, because we are in the desktops devroom. So I removed a lot of slides here. I was studying frameworks, studying a lot of things here, and I just removed that. So let's imagine NGINX. NGINX may have a configuration for UNIX sockets and for TCP. In this case it's for TCP, and in this case it's for UNIX sockets. You can check the documentation. And we imagine that NGINX is talking to a web application server. So we are talking via TCP to this application server. We can use TCP or a UNIX socket in this case; we just chose TCP for reasons, right?
So, Unicorn will talk to Rack, which will talk to Rails. In this case, it's just a Rails setup, right? You don't have to be familiar with Rails, but what Rack does is just provide the protocol between Unicorn and the Rails app. So you can change your framework to Sinatra, right?
And your Unicorn to, I don't know, Puma, and keep the same protocol there. This is why Rack exists in this case. Another good question here is why I have to put a web server in front, instead of just exposing Unicorn, a web application server, to the internet. The reason is that Unicorn is not well prepared for that; NGINX is well prepared for that. Unicorn is just implementing HTTP, but it's not, I don't know, it's not a good guy to be on the internet because, you know, the internet is difficult. We have a lot of Brazilians there.
So, what Rails will receive here is just a Ruby hash, with a lot of things in it. We have CGI 1.1 there. We have, I don't know, this one is localhost, because I was testing locally. HTTP 1.1, WEBrick, that's the web server, the application server. That's it, we have a lot of things here. Rails will receive that and will try to understand what we want with this URL, so it tries to work out what is the controller, what is the model, and what is the view. So let's imagine the MVC here. I changed it a little bit just to start at the Rails controller: we get some data from the model, build a view, and we'll have HTML here. So, this is my personal blog, finally.
So we can finally go all the way back, all the way back through WEBrick, until we get to the user. But we will not get to the user right now, because it's just plain text, right? We don't want to show plain text to our user. He doesn't want to, I don't know, read the HTML for that. So that's the time when you have a spinner there in your browser: it is just receiving a lot of information, which is HTML, and it will have to parse that and build the web page. Here is where the CS algorithms that we learned at university come to life, right? Because we have to build the DOM tree, that's the Document Object Model tree, based on a lot of text.
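The step from "a lot of text" to a tree can be sketched with Python's standard `html.parser`. This is a toy, far simpler than the HTML parsing algorithm that Blink implements: the class `TreeBuilder`, the helpers `build_tree` and `outline`, and the short `VOID_TAGS` list are all invented for this example. It does show the forgiving behaviour browsers are known for, since the unclosed `<b>` below is closed implicitly instead of raising an error.

```python
from html.parser import HTMLParser

# Elements that never have children or a closing tag (partial list for the demo).
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class TreeBuilder(HTMLParser):
    """Build a nested dict tree from HTML using a stack of open elements."""

    def __init__(self):
        super().__init__()
        self.root = {"tag": "#document", "children": []}
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "children": []}
        self.stack[-1]["children"].append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close everything up to the nearest matching open element.
        # A stray end tag with no match is silently ignored.
        for index in range(len(self.stack) - 1, 0, -1):
            if self.stack[index]["tag"] == tag:
                del self.stack[index:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.stack[-1]["children"].append(
                {"tag": "#text", "text": text, "children": []}
            )


def build_tree(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def outline(node, depth=0):
    """Return the tree as indented lines, one node per line."""
    label = repr(node["text"]) if node["tag"] == "#text" else node["tag"]
    lines = ["  " * depth + label]
    for child in node["children"]:
        lines.extend(outline(child, depth + 1))
    return lines


tree = build_tree("<html><body><h1>Blog</h1><p>Hello <b>world</p></body></html>")
print("\n".join(outline(tree)))
```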
And so, there's a link to the implementation inside Chromium and inside Blink, which is the rendering engine for Chrome. So that's it: Chromium will receive your HTML and fix a lot of errors for you. I don't know about you, but I never saw an HTML syntax error. "You can't see your Facebook comments because there is an HTML syntax error." That does not exist, right? And that does not exist because the browser is doing a lot of magic for you. So that's it, thanks, browser. We had more black magic in the past when we had IE6, you know. It was more magic than we wanted. Just like Gandalf in The Hobbit and a lot of it.
Okay, so we have a lot of tags here. We are receiving images and scripts and CSS and this kind of thing. So the browser starts downloading those: every image and every link will be downloaded in a different thread; by thread I mean asynchronously. So those are being downloaded asynchronously. And every time it finds a JS file, it will be downloaded synchronously. So it will block your whole rendering, download the file, execute it, and then continue. That's why you have to put your JS files at the end of the page. So that's the reason.
And just remember that every one of those guys is a happy Gandalf thing, yeah. So you'll have a lot of, I don't know, three-way handshakes and TLS things. I know there's keep-alive to save the day, but in theory you have a lot of things going on here. And by using HTTP/2, you will use just one connection and multiplex through this connection. So use HTTP/2; it's the future.
We'll have to do the same thing with CSS. We have the DOM tree and the CSSOM tree, so we have to do the same thing: it's not just text, we have to create this tree. So we have those two trees and we have to merge them. When we merge those two trees, we'll have the render tree. And based on the render tree, we can finally build the whole application, the website in this case.
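The merge of the two trees into a render tree can be sketched in a few lines. This is a deliberately tiny model and not how Blink works: the "CSSOM" here is just a dictionary from tag name to style, with no selectors, cascade, or inheritance, and the function name `build_render_tree` is made up. The one real-world rule it illustrates is that nodes styled with `display: none` do not appear in the render tree at all.

```python
def build_render_tree(dom_node, cssom):
    """Attach styles to DOM nodes and drop subtrees that are not displayed."""
    style = cssom.get(dom_node["tag"], {})
    if style.get("display") == "none":
        return None  # invisible nodes (and their children) are not rendered
    children = []
    for child in dom_node.get("children", []):
        rendered = build_render_tree(child, cssom)
        if rendered is not None:
            children.append(rendered)
    return {"tag": dom_node["tag"], "style": dict(style), "children": children}


# Toy DOM and toy CSSOM, both invented for this example.
dom = {
    "tag": "html",
    "children": [
        {"tag": "head", "children": [{"tag": "title", "children": []}]},
        {"tag": "body", "children": [
            {"tag": "h1", "children": []},
            {"tag": "p", "children": []},
        ]},
    ],
}
cssom = {"head": {"display": "none"}, "h1": {"color": "red"}}

render_tree = build_render_tree(dom, cssom)
print([child["tag"] for child in render_tree["children"]])
```

A real engine then runs layout over this tree to compute a position and size for every box, which is the x and y the talk turns to next.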
So, there's a good article about that; it's a master's thesis, I think. Just read it if you want. It's really interesting. So you will have a render tree, and this render tree will have x and y. So you'll have to draw this box at x 100 and y 100, and the browser will keep painting that on your display. So finally, we have a happy Gandalf in this case. We finally have my personal blog, so he can read about Go and Ruby, I don't know. So that's it.
As a summary: open source powers the web. We have a lot of open source software out there, and it's powering the web; it's really cool. And yeah, that's it. It's really easy, right? Okay, so I hope you enjoyed this presentation and that it helped you understand what's happening in a web request through open source software. I'll be waiting for questions if we have time for that, and thank you. That was fast.