Okay, so I can't see my next slide. I know it's not a big deal. If you've done presentations before you might know what's coming next, but I average between 250 and 300 slides, so knowing the next one is kind of handy, and I won't be able to do that today. So please bear with me. So I'm going to start. Hopefully it's all of us. So my name is Daniela. I work for a company called Learn Kubernetes. We are based primarily in Europe, so London and Milan, and just recently in Singapore. But I'm not here to talk about what we do. I mean, we do training for Kubernetes, that's basically what we do. I'm here to talk about a little bit of a story that started a year ago. So a year ago I was here in Singapore, actually, and there was a really interesting tweet on Twitter. It was from a guy in Japan, Manabu, and I can't read Japanese, and I don't know if you can. Good for you, really good for you. But it basically says: what if I take a Kubernetes cluster and I tamper with the networking? Will Kubernetes still work? Can I still use my cluster, or is it going to fall apart? So we basically took that as a challenge, and we said: can we actually do that? Can we take whatever Manabu has done and replicate the findings? Now, before we dive into what we actually did, a little bit of a recap of what Kubernetes is and how it works. So usually what happens is we have a collection of servers like this one. Those can be on-prem servers, those could be virtual machines. And as engineers or DevOps, the challenge is: how do we actually manage a lot of them, and how do we manage them efficiently?
So one way, and it's not the only way, but one way is to use Kubernetes. The way it works is we have a master node, which is going to receive all the commands, and then we have the other nodes joining the master, and when that happens we call it a cluster. Okay, and the nodes, I mean, in this picture they are all the same, but they could actually be of different sizes. The only things we actually care about are the memory and CPU of those nodes. So as soon as you add nodes into the cluster, you're going to add memory and CPU to the overall memory and CPU of your cluster. So that's all Kubernetes is: it's basically just merging all of your servers into a single machine. Okay, you can imagine Kubernetes as your single VM in the data center. But why would you do that? Well, the reason is pretty simple. Now you don't need to deploy to a particular server anymore. When you deploy, you can just deploy directly into a single machine. So I just say: Kubernetes, please deploy this application for me four times. This is going to create four applications inside my infrastructure. Then, because Kubernetes has got this layer of abstraction between you and the data center, it can make smart decisions. It can look at the nodes and say: actually, you know, I'm going to place the applications like this. And that's basically the beauty of what we see with Kubernetes. This kind of design leads to some interesting results. The first one being: well, if Kubernetes can analyze my infrastructure and take care of it on my behalf, then when a node goes away, Kubernetes is smart enough to move that application to a node that is available. Okay, so we set up Kubernetes, we said: please give me four applications. It realized that one was gone, the node was lost, and it just rescheduled that application onto a new node. And that's great.
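The placement idea described here can be sketched as a toy simulation. This is not the real Kubernetes scheduler, which scores nodes on many criteria; it's a minimal illustration, with made-up node names and CPU counts, of "for each replica, pick the node with the most free capacity":

```python
# Toy sketch of the scheduling idea: Kubernetes pools the nodes' capacity
# and decides placement for you. Node names and CPU counts are made up.

def schedule(replicas, free_cpu):
    """Greedily place each replica on the node with the most free CPU."""
    placement = {}
    for i in range(replicas):
        node = max(free_cpu, key=free_cpu.get)  # most spare CPU right now
        free_cpu[node] -= 1                     # each replica "costs" 1 CPU here
        placement[f"app-{i}"] = node
    return placement

print(schedule(4, {"node-1": 4, "node-2": 2, "node-3": 2}))
```

With those made-up capacities, the first three replicas land on the roomy `node-1` and the fourth spills over to `node-2`, which is the "merge servers into one big machine" behaviour the talk describes.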
So we've got reliability, it's built for fault tolerance, and everything works. Now, this is a simple example. If you're deploying inside a cloud provider you probably have a load balancer on top of your nodes, and then what usually happens is: the node goes down, the node is detached from the load balancer, and then you reschedule that application somewhere else. That's basically how Kubernetes works, and why we find it so useful. But the other interesting thing about Kubernetes is that it's designed to scale. So instead of having three nodes we might have more nodes. As an example, this is basically how we structure applications in Kubernetes. We usually have the application underneath; then we have an internal load balancer, which we like to call a Service, because that's not an overloaded term at all in computer science; and then at the top we have what we call an Ingress, which is like an external load balancer. But we talked about scale, so what if I'm really keen to have a really large cluster, but I don't have enough applications to actually fill that cluster, like in this case? Okay, I was a little bit too keen to scale my cluster, and there are only two applications and three nodes. So if I access the first node, would you expect a response from the application? Yeah. Second one, you know what's coming. If I access the third one? No? Anyone else saying yes? Yes, it is part of the cluster. So we've got a yes and a no. I'm actually going to the node, trying to access the service. What would you expect? Yes? A 502? Anyone else? Okay. Is the traffic lost? Is it a timeout? Is it a 404? Which one is it? Actually, none of them: it works. You get a response back even if there is no application on that node. Okay, so why is that? How does that work? Is it magic?
So the first thing, when I looked at this for the first time, was: is the load balancer actually doing the smart routing? So if you look at the applications, I've got two applications deployed in my cluster, and there is an application load balancer for AWS, or an application load balancer in Azure. That load balancer actually knows where the applications are, so it can just route the traffic directly, right? Like this. Does that sound fair? It does, right? Unfortunately, I don't want the logic to stay in the load balancer. That would be a cloud-specific load balancer, so it's not really something I can take away, and it would be really hard to implement if I've got an on-prem cluster. It's also a little bit of a single point of failure, and it's hard to scale depending on what you do. You can imagine that if you're running an on-prem data center, then scaling a component like this could be quite problematic, plus you need to sync those rules somehow. So this is not what Kubernetes does. This is not how we route traffic inside the cluster. The other option could be: what if we route all the traffic to the master node first? The master node knows everything about the cluster, so it makes sense to go through the master node. The master node is going to tell me where the application is deployed, and I'm going to route the traffic to that node. Fair enough? Like this. Does this work? Well, it's great, right? I've got someone saying no. It does work, right? It's vendor agnostic: yes, it's going through the master, which doesn't belong to the cloud provider. We solved that problem. But it's a single point of failure, and it's really hard to scale. How would you do that? How would you scale those master nodes? You'd need a lot of master nodes. Imagine, if you're deploying on GKE, which is the Google cloud platform, we're talking about 5000 nodes.
How many master nodes do you actually need to support 5000 nodes? Exactly. And then we start having all these sorts of conversations, and it's really complicated. This is not actually what happens in Kubernetes. The first idea we had wasn't too bad, right? This load balancer that knows everything and can route the traffic, that's actually a really clever idea. The only problem was that it sat outside the cluster and not inside. So what if we take that load balancer, break it apart, and have one in each node? Okay, if we do that, that's quite a clever idea, because if we have traffic going directly to the node, it can go through the load balancer on that node, and if the load balancer knows all the routes, it can just say: well, this is not mine, go somewhere else. Okay, this is actually what happens in Kubernetes. On each node we have a component that does exactly this. So you can see it's cloud vendor agnostic, it's got redundancy built in, because the more nodes we have, the more load balancers we're going to have, and it scales with the nodes as well. The component that does that in Kubernetes is called kube-proxy. It's a binary which is installed on each and every node in your cluster, and that component is in charge of setting these rules on the node itself. But the question is: how does this kube-proxy know the routes? I thought the master knew the routes. So what is kube-proxy doing? It turns out that when you ask Kubernetes for a deployment, here is what happens.
So we ask for the deployment of an application, and the request goes inside Kubernetes. The first component to receive the request is the API server. The API server stores the request into the database; the database will keep that request for you. Then there is a series of components inside Kubernetes that will analyze the state of the database and create the application for you. And there is a third component, the scheduler, which will look at your application and make sure that any pending elements are going to be scheduled. At that point, we've got another component inside Kubernetes. Next to kube-proxy, we've got the kubelet, which is like a glorified agent. The agent goes and asks the master node if there is any update, that is, any application that should be deployed on that particular node. If it finds one, it's going to delegate the creation of the application inside the node. So when that process happens, the application is created, an IP address is assigned, and that assigned IP address is then returned back to the master node. So you can imagine that the master node has all the information. When it created the pods, when it created the applications, what it did was keep a table like this. We knew where each pod was, because we assigned it to a node with the scheduler, but we didn't know the IP address at that point in time. Now, because the kubelet created the application and sent the message back to the master node, we also know the IP address assigned to that particular pod.
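The bookkeeping just described can be sketched in a few lines (pod, node, and IP values here are illustrative): the scheduler fills in the node first, and the IP column is only filled in once the kubelet on that node has created the pod and reported back.

```python
# The master's table: pod -> {node, ip}. The node is known at scheduling
# time; the IP only once the kubelet has created the pod and reported back.
# All names and addresses are illustrative.

table = {}

def schedule_pod(pod, node):
    table[pod] = {"node": node, "ip": None}   # scheduled, IP not yet known

def kubelet_reports(pod, ip):
    table[pod]["ip"] = ip                     # master learns the pod's IP

schedule_pod("app-1", "node-2")
print(table["app-1"])   # IP is still None at this point
kubelet_reports("app-1", "10.0.2.31")
print(table["app-1"])
```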
Okay, so we have all the information inside the master node, and the list is always up to date. What that means is that when you add an application, the master node will be updated; when you add another one, it's going to be updated as well; and when you remove one, it's going to be removed as well. That happens every time you create or delete one of your applications inside your cluster. That's not the only list, though. There is another list, which has to do with what we call a Service, which is basically an internal load balancer. So the internal load balancer has a list of IP addresses as well. In reality, we've got two long lists inside the master: the first one belongs to the pods, the applications, and the second one belongs to the load balancer, basically: which IP addresses am I routing that traffic to? Those two lists are quite handy, right? But they live inside the master node, so kube-proxy doesn't know anything about them. So what happens is that kube-proxy asks the master node for these lists and sets up routing tables on each and every node inside your cluster. And that's basically the magic behind Kubernetes. So we can finally answer the question: what happens when I hit a node and there is no application? The first thing which is going to happen is that kube-proxy will read the routing table and say: hey, there is nothing here, go somewhere else. And "somewhere else" is actually: looking at the table, seeing that there is an internal load balancer; the internal load balancer points to either pod one or pod two; we look at pod one and pod two, we take the IP address, and we finally route the request. Okay, so that's what happens in Kubernetes every time you hit a node. Now, that was only the intro for my talk.
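Putting the two lists together, the lookup that kube-proxy's rules implement on every node can be sketched like this. The service name and pod IPs are hypothetical, and the real kube-proxy writes kernel-level rules (iptables or IPVS) rather than running code like this; it's only meant to show why a node with no local copy of the app can still answer.

```python
import random

# The two lists synced from the master to every node (illustrative data):
pod_ips = {"10.0.1.10": "node-1", "10.0.2.11": "node-2"}  # pod IP -> node
services = {"my-service": ["10.0.1.10", "10.0.2.11"]}     # service -> backends

def route(service):
    """Any node can answer: pick one of the service's backend pod IPs."""
    return random.choice(services[service])

# Even on a node that runs no copy of the app, the rules still point at
# pods living on other nodes, so the request gets a response:
backend = route("my-service")
print("forwarding to", backend, "on", pod_ips[backend])
```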
Okay, so now you know a little bit about how Kubernetes works and how the traffic is routed inside your cluster, and we're ready to actually break things. Now that we know how it works, what happens when things go wrong? The first scenario: let's say we are a little bit unlucky, and one of the nodes, the one on the left, cannot connect to the master anymore. What's going to happen? Yeah. Because the node on the left is not able to connect to the master, it won't be able to update the routing table. So if any request comes in, it's still going to use the old routing table, which could be good or bad, depending on how many pods, how many applications you created and destroyed in that time. It might still go to an existing application, or it might fail. Whereas the other nodes will work as usual, right? Because they can contact the master, they can download the list, and they can continue working. So that's what we see when there is something like a split brain. It almost works, kind of. The disadvantage is that you could have a stale routing table. And eventually, when the network recovers and we have a single network again, that node can contact the master node again and just get the new list. So the good news is that eventually Kubernetes will converge and fix everything by itself. Good news, I guess. So the second scenario, we start breaking things. The second one is: what if I kill that process inside a node? We talked about kube-proxy, which is an agent that sits inside a node. But what if I actually go and stop that process? What is going to happen? Well done. So if a new pod comes in and we don't have kube-proxy, the routing table stays the same. Sorry, if a pod crashes, well, there is no one to update those tables, right?
So it's going to stay the same. We get a similar scenario to what we had before: for the existing pods it's going to work, for some others it won't. So it's basically not great, but it still works. And, like you said, it's a DaemonSet. In Kubernetes we've got something called a DaemonSet, which is basically just a fancy way of saying that when it goes down, when it crashes, it's going to be restarted. Okay, so that's how Kubernetes is going to solve that problem for us. And yeah, it almost works. We've got the stale routing table, and we can kind of make it work. So that's basically number two. Now let's go even deeper into this. What if you have not necessarily good intentions, and let's say that the routing table is lost? As in: you go inside and you tamper with the routing table. You change the values, or you remove the values, without anyone noticing. And that's basically what Manabu from Japan did, and that's what we found out, so we published this article last year. He basically described this situation where you go inside a node, you actually tamper with the routing table, and then you see what happens. So this is the scenario that we're going to analyze today. We've got one single node, one single load balancer, and the plan is very simple: we get into the node, we drop the routing table, and we see what happens. Okay, so before we move forward, what's your guess? What is going to happen? Split brain scenario? Let me see. I don't want to do spoilers. What is going to happen? Okay, so: is it going to work? It's going to recover? We've got a lot of faith, I like that. Anyone else? It will continue to work, just because you said it. Anyone else saying that it's not going to work? Okay, we'll see. So what we did is: we took that back to the team, and we just replayed everything that Manabu did. Well, the story is a little bit more complicated. I'm sure there is drama in there somewhere as well.
They're going to make a movie. But in essence, this is what happened. We basically set up a very simple loop in bash that, every second, prints the date and the value we get back from the application. And the application is quite simple: it always replies with "Hello world". Okay, so this is what happens. That's quite easy: every second we see a response from the application. And then what we do is, we are very nasty: we get inside the node, we drop everything, and then we observe what happens. So we expect... nothing happens, nothing, nothing... oops. Nice. I mean, there is a little bit of a gap, but I can sort of deal with that. But it works. Okay, but if you look at the numbers: it's 47 up there, it's 14 down here. That's 27 seconds. What happened in those 27 seconds? And I don't know why. Why 27 seconds? Why not, you know, five or ten? So, it's about 30 seconds. Okay, we can leave it at that. So maybe something is going on with, I don't know, maybe the load balancer, right? We have too many things to analyze, so the easiest thing we can do at this point is to remove things from the equation. We have a load balancer at the top; we just remove the load balancer and we repeat the same scenario, but this time we go direct to the node. Okay, so same loop, but this time I'm going to curl the IP address of the node, and we try again. And this is what we see. I see "Hello world", we drop the routing table, nothing.
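For reference, the demo loop can also be written in Python; the bash original just did `date` plus `curl` every second. The URL below is a placeholder for the cluster's entry point, and the 10-second timeout matches curl's behaviour seen in the demo.

```python
import datetime
import time
import urllib.request

def poll(url, iterations=5, timeout=10):
    """Every second, print a timestamp and the app's reply (or a timeout marker)."""
    for _ in range(iterations):
        stamp = datetime.datetime.now().strftime("%H:%M:%S")
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                body = resp.read().decode().strip()
        except OSError:          # connection refused, reset, or timed out
            body = "<timeout>"
        print(stamp, body)
        time.sleep(1)

# poll("http://<load-balancer-ip>")   # point this at your own cluster
```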
Whoops. Timeout, timeout. Then it's back. Okay. So why do we see the timeouts this time, when before we didn't? When we go directly to the node and we use curl, we see the timeout twice, and then it goes back up. A couple of interesting things here. The first is that curl times out at 10 seconds. This makes me think that maybe the load balancer takes a bit longer to time out, but in the time it took to time out, the cluster for some reason recovered, so we never saw the timeout happening. So that load balancer must have a timeout greater than 10 seconds. The other weird thing is that someone is fixing the routing table for me. And the last one is: why that long? Why not 20? Why not 10? What's going on? So you might have guessed that this is sort of kube-proxy's fault. Or, well, I guess "fault" is probably not the right word, but what is happening? It turns out that these rules that are set up on each and every node are actually synchronized on a regular interval. There are two flags you can set to make sure that your routing tables are refreshed frequently enough, so that if something happens to these routing tables, like me going inside and dropping them, they can be fixed in a proper manner. So these are, for example, two flags: the first one is how often, so this is 30 seconds, and the second is the minimum time between refreshes, which is there to make sure we don't do too many refreshes at once and overload the master node. Okay, so these are the two mechanisms we have to control the routing table on each and every node. So if you're a little bit lost about why we were doing the curl, this is what happened, basically. Here's a visual representation.
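The two flags just mentioned are, in kube-proxy, `--iptables-sync-period` (how often the rules are re-synced, 30 seconds in the demo) and `--iptables-min-sync-period` (the minimum time between two refreshes). The roughly 27-second gap then makes sense as "time until the next scheduled refresh", which this toy timeline illustrates; it assumes a fixed schedule, which is a simplification of the real sync loop.

```python
# Toy timeline: if kube-proxy re-syncs the rules every `period` seconds,
# wiping them at time t leaves a gap only until the next scheduled sync.

def gap_until_next_sync(wiped_at, period=30):
    next_sync = ((wiped_at // period) + 1) * period
    return next_sync - wiped_at

print(gap_until_next_sync(3))    # wiped early in the cycle: 27s gap
print(gap_until_next_sync(29))   # wiped just before a sync: 1s, barely noticed
```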
This was our setup. This was the routing table. And what we did was: we killed the routing table, so we removed it from the node, and we made a request. That request would go through the load balancer; the load balancer would forward the request to the node; but because there is nothing listening on the node, it would just wait until the timeout. By that time, kube-proxy recovers and refreshes the routing table, that comes back up, and then finally, because there is something that can accept incoming connections, the request goes through and reaches the application. That's what we did with this simple test. Okay, so these two flags, you can tweak them; they are part of kube-proxy. And that's basically everything I wanted to show you today. I've got a couple of things that I learned in the process that I want to share with you. The first one, and this is perhaps the most interesting for me: we originally looked at this problem a year ago, and at that time we wrote a blog post. I took this same talk to another conference, and now I'm doing it again. And every time I give it again, I basically find an error in what I explained, and I need to go back and rethink how Kubernetes works. Now, this is just to give you an idea that things inside Kubernetes are quite complicated. They're quite complicated for a good reason, because we want to build a product which is reliable, but at the same time, I wish we could do better with documentation and with explaining how things work.
I showed you a very simple example of networking in Kubernetes, but this is what it looks like with the nitty-gritty details of IP addresses and routing tables. It's the same as before, so it might be a little bit overkill, but this is what happens. You go inside a node, and you're actually requesting an IP address which doesn't exist; that's another one of the weird things about Kubernetes. Kubernetes will kick in, read that request, realize that the IP address doesn't exist, look at the table, and replace that IP address with one for a pod inside your network. Then, depending on the type of network you've got, it will go through a routing table and finally reach your pod. Okay, it's a little bit more complicated than what you saw earlier today, but that's just to give you an understanding of how complex things are if you start digging inside the cluster and how Kubernetes works. The other thing that I find really useful is articles like this. This is a pretty old article, but it's Julia Evans'. She's amazing. This kind of article really goes deep down into what networking is in Kubernetes, and gives you more of an idea of how things actually work inside Kubernetes. The other thing I wanted to share is: how do you get better at this kind of stuff? Well, the only way to get better is to actually try and break it, and become better at it. So the two resources that I would recommend are the CKA practice environments, which are basically just collections of challenges, and they're also useful if you're interested in being certified. The first one is just a collection of useful challenges. Some of them are not particularly good, but they're good challenges for you to tackle and practice on, so that's something I would recommend. And then: how do you protect against things like this?
How do you protect against someone going inside the cluster and dropping the connections, or against not being able to reach a particular application because the routing table is stale? Now, there isn't a simple answer, because it depends on what application you're building, what your constraints are, how you tweak the flags. But basically it all boils down to how you monitor your infrastructure and how you control your pods and everything else. So I won't be able to give you just one link and say: this is the holy grail, go and do it. You will need to understand how your application works to actually fix it. And that was it. Thank you very much for listening to my talk. I hope you enjoyed it. I've got a couple of extras. If you liked the talk, I wrote the same thing up as a blog post. Okay, so this is the blog post that we wrote. It goes through the same example, and you can also check out the code and try it on your own. Maybe not if you've got a production cluster, I probably wouldn't advise that, but if you'd like to try it at home, please do. I've also got stickers. Okay, so just be careful with the stickers, because the last time I did this there was a guy who just went there and grabbed all of them. All of them. And it wasn't fair for everyone else.
Okay, so be mindful of others, I guess. Just be mindful of others as well, please. Any questions?

The question is going back to scenario one, the split brain syndrome, when you have no communication between the workers. In the networking world, to avoid this kind of stuff, you have to have redundant network connectivity between systems, whereas currently, as I see it, Kubernetes networking is based on the assumption of a single network.

I think, to answer your question, and I hope I got the question right: the question is whether I can actually use a redundant network for my Kubernetes cluster. So, to answer that: Kubernetes is built around three basic rules of networking. The three basic rules are: any application can talk to any other application inside the cluster; any node can talk to any other node and any other application; and the third rule is that an application sees itself with the same IP address that everyone else sees it with. Okay, those are the three rules. Now, if you satisfy the three rules, you can implement any type of network you want. So usually what we see when we work with clients is that they ask us for a Kubernetes pod, a Kubernetes application, to have two IP addresses: one which belongs to one network, the network of the cluster, and a second one which is actually on a network somewhere else. So you could have multiple networks attached to your pods.
I don't think that's quite what you asked for, though.

My question was specifically related to the nodes. The nodes are communicating with each other; there's an internal node network, but that's only one network, because you do not have two network interfaces per node.

That's correct, but it's also not always the case. Sometimes, for example, if you're using Amazon, if you're using EKS, the network for the nodes and the network for the applications are the same network, which is the VPC. But if you take the same configuration in GKE, which is the Google cloud platform, then the node network is separate. Exactly. So I think, as long as you satisfy the three rules of networking, you can design the network the way you want, with EKS having one design and GKE having another. Depending on what you're building, and I've got very little understanding of what you're trying to achieve, but as long as you can satisfy the three rules, you should be able to have redundancy in the network itself.

So actually you can, as he said, and he gave the public cloud example only; even in a private cloud you can build multiple networks and do that. The other thing: you talked about the split brain situation with the masters. The split brain situation in this case: the entire thing works on quorum.
So you need to have three instances to have quorum. For example, when you have three instances to achieve quorum and one goes down, there is a way for them to elect a master by themselves and continue to work. The split brain problem here was a bit different: this is more of an active-active cluster, so even if one goes down, the others are going to continue to work. And the other thing that works here is the eventual consistency he talked about: over time, things are able to settle down. The entire design premise in this case is what we call fault tolerance rather than fault resistance. If something goes wrong, other things will continue to work for some time with whatever information they have, and once whatever went down comes back, it will try to sync up and figure out the best state. So that's a little design philosophy difference here, fault tolerance versus fault resistance, and that's what plays a role in this.

Any other questions? Okay, cool. That was it. Thank you very much.