Hello, everybody. Welcome to this session on Neutron troubleshooting. We are from Dell EMC. I'm Nita He, and these are my colleagues, Diego and Mohammed. We have been working on a turnkey solution based on OpenStack called VxRack Neutrino. It is actually running live here at the Dell EMC booth. It is a self-contained mini data center that contains the networking, storage and compute, and for those of you who have not seen a data center in real life, you may want to go there and take a look. It's the rack and everything you would need for running a mini data center, so I definitely encourage you to go and take a look.
Today we're going to go through the presentation in storytelling mode, so let me give a quick introduction to all the actors. I'm the one who is under constant pressure to deliver, and I won't hesitate to escalate any issue. Here's Diego, the innocent-looking Diego, who is trying to remain calm through all the storms. And we have Mohammed, the network guru, who knows all the tricks and has all the magic to fix problems, but who will also make fun of us by pointing out our stupid user errors. We are working on a project called... Oops. Okay, I was going to say a project called Unicorn. If by any chance you have a big project called Unicorn, or employees like us, it is pure coincidence, okay?
So one Monday morning I woke up to this dreaded message on my laptop. The first thing I do is call Diego, the guy on the front line. Hey, Diego, what's wrong with my server? It's very important to me. Go and fix it. You have ten minutes. Work on it as soon as possible.
Right, so this is how it starts for all of us in IT. It's Monday morning.
I had an awesome weekend, I haven't had my first cup of coffee, and I've got this guy here, whose bonus is going to be impacted, dumping it all on me. We'll see how that goes.
Cut the crap. Go ahead and fix the problem.
Oh, he is here. Do we have a ticket before I look at this?
Yes, there's a ticket. All of the details are there, so you should have all you need to fix the problem.
All right, I'll get that done. So apparently, as I can see here, the site can't be reached. This whole Unicorn thing has been going on for a long time, and they always blame somebody else. So let's take a look at this for a second. The basic architecture is what we see here. He's trying to access Unicorn from his laptop, so he goes into his OpenStack tenant, and he's just trying to see if it works or not. Most of the time this is a user problem, so I really won't look at this right now. I'm going to push the problem off to the side. It's probably networking or firewall. It's usually those two groups who cause me problems. Hey, Mohammed, how are you, man? You doing good? You know Lida is back again with some drama.
Oh, man.
I'm pretty sure I've checked all of OpenStack. I've looked everywhere.
He knows that we are working at the summit here. Doesn't he know that?
Yeah, so he has a ticket and all that, and I think it's really networking somehow. Do you mind driving, and showing him what he needs to look at first?
Yeah, sure. Can I have some time just to get to know the crowd and our guests here?
Sure.
Hi, everybody. He already introduced us, so I just want to get to know you more, and I'm going to do that by asking some questions. How many of you think of yourselves as more network experts than Linux experts? Okay, a few hands. And how many of you think the opposite? The other side. The rest think they are experts in both? Okay.
Or in none.
Or in none. It's okay. So one more question.
How many of you know what the Address Resolution Protocol, or ARP, is? Okay. Good. Many hands. For those of you who don't, don't worry, I'm going to explain a little bit about it. The reason I'm bringing it up is that we really need to know the foundation before going into more advanced stuff, and ARP is one of the basic TCP/IP protocols that you should know.
So say you have a network with three hosts, host A, B, and C, all on the same layer 2 segment. I drew two lines under "layer 2" here because ARP is used when your hosts are on the same layer 2, so they are on the same subnet. In this example host A wants to communicate with host B. Host A sends a broadcast ARP request, which reaches host B and host C. Host C will not respond, because it doesn't have the IP address that host A wants to reach, which is 192.168.1.2 in this example. But host B does have it. So host B says, hey, it's me, and my MAC address is this, and sends that back to host A. Now host A starts building what some people call the MAC address table and others call the ARP table. Host A now has 192.168.1.2, which is the IP of host B, together with the MAC address of host B. So things will be nice, and host A will be able to talk to host B.
Okay, so now that I know you, let's get back to the problem.
Finally. I'm losing my patience.
Yeah, no patience here. So here's Lida on the right and the Unicorn application on the left. I want you to try to memorize these IPs, because you are going to see them all along in this case. Lida's IP is 10.252.72.99, and he's trying to reach a floating IP at 10.246.155.239. There's no conflict between these two IPs. This one is in a /16 range, and that one is in a /24. Okay. So the first thing to look for... actually, I forgot to mention something.
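As a rough illustration of the ARP exchange described above, here is a minimal Python sketch. Only 192.168.1.2 for host B comes from the talk; the other IP addresses, the MAC addresses, and the class and function names are made up for the example, and real ARP involves packet formats and cache timeouts that are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    ip: str
    mac: str
    arp_table: dict = field(default_factory=dict)

    def handle_arp_request(self, target_ip):
        """Answer with our MAC only if the request is for our own IP."""
        return self.mac if target_ip == self.ip else None


def arp_resolve(sender, target_ip, segment):
    """Broadcast 'who has target_ip?' to every other host on the segment."""
    for host in segment:
        if host is sender:
            continue
        reply = host.handle_arp_request(target_ip)
        if reply is not None:
            # The sender caches the IP-to-MAC mapping in its ARP table.
            sender.arp_table[target_ip] = reply
            return reply
    return None  # nobody on this layer 2 segment owns the address


# Illustrative addresses; only host B's IP appears in the talk.
host_a = Host("A", "192.168.1.1", "aa:aa:aa:aa:aa:01")
host_b = Host("B", "192.168.1.2", "aa:aa:aa:aa:aa:02")
host_c = Host("C", "192.168.1.3", "aa:aa:aa:aa:aa:03")
segment = [host_a, host_b, host_c]

arp_resolve(host_a, "192.168.1.2", segment)
print(host_a.arp_table)
```

The `None` return for an unanswered request is the case that matters later in this session: a host that asks for an address nobody owns never learns a MAC and cannot send the packet.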
In our architecture we are using distributed virtual routing, which means that the networking piece goes along with the instance. When you launch your instance, let's say it's launched on host A, the networking for it is going to be on host A, especially when we're talking about floating IPs. So we have to find which compute node the instance is hosted on. To do that, you run nova list, then nova show with the instance name and grep for host, and there you find your host name. Okay, next.
Here you can see an overview of the network components involved in this kind of packet flow. We have Lida on the external side, outside of our cloud, who wants to reach something internal, in the cloud, on the floating IP of his Unicorn application. So, external to internal: here is Lida. He first hits the OVS external bridge on the br-ex interface, and after that he goes to the floating IP namespace. Just a note for people who already know this architecture: we have made some changes, so the fg port has been moved off the integration bridge. We don't go through the integration bridge; we go directly from the external bridge to the floating IP namespace. It's not relevant to our discussion here, just a side note. I'm going to explain more about each component as we go on, but first I want you to know how the flow goes.
The floating IP namespace acts as a regular router with one special function, which is to do proxy ARP. We talked about ARP, you remember? The reason we use a floating IP namespace and then the qrouter to its left is that we don't want to waste our floating IPs. On the router we give the floating IP a /32, and on the floating IP namespace we configure our floating IP range, so that it responds to ARP requests on behalf of the qrouters.
This helps us conserve floating IPs. Otherwise we wouldn't need an additional device just to do routing, and that is why we have two routers. Then we go to the qrouter, which we will talk about more later, then to the integration bridge, then the Linux bridge, and then we reach our instance.
To make this more visual, I've drawn it as a diagram of a physical network. Everything you see here is on the same host. Starting from the external switch: the OVS external bridge is just a switch connected to our external router. From there it's connected to the floating IP router, and then to another router, called the qrouter. The qrouter is the one that you see in Horizon: when you create your project and then create your router, this is the one you see. The FIP router you don't see in Horizon; it's in the system. Then comes the integration switch, which is just a layer 2 switch. Then you have the Linux kernel switches, and here I wrote layer 2, layer 3, switch plus firewall, because the reason we use the Linux bridge is that the integration switch cannot use the Linux kernel's iptables. We need that for the security groups that you configure in Horizon for your project.
Let's go now and see each component individually, starting with the external bridge. For people who are not following or feel that this is very complex, don't leave the room, because you're going to learn some tricks and some commands on the way, so you can take something out of this. The first thing to learn is tcpdump. How many of you know tcpdump? Eight? Well, that's good. Make sure you always have it along when you're troubleshooting networking. I like to use some switches, like -n for not resolving, so that I see only the IP addresses. I don't want to see the host names, because I know the IP addresses. And then there is -e.
To show the MAC addresses, I always like to turn this on. Then you specify your interface and the protocol. Lida, do you mind doing a continuous ping for me? From your laptop to the...
I've been pinging all the time since the morning.
You're still pinging?
I'm pinging. Waiting for it to come back. Don't ask.
All right. I'm going to go into the external bridge. Can you see this clearly? Okay. So here I SSH into the node that I just found, where my instance is located, because all my troubleshooting is going to take place there. It happens that Nova compute runs in a Docker container, so I have to go into that container. If I do ovs-vsctl show, which is an OVS command, I can see a lot of interfaces. We will learn later how to know which interface to use. Because this is external traffic, the first thing it's going to hit is the br-ex interface. So if I do a tcpdump for the ICMP protocol, I can see the source IP that I showed you, Lida's, sending an ICMP request to the floating IP, but I don't see the reply. We don't have a reply here. So let's go to the next step. You have to close this.
Okay, so for the next step I have to check whether the traffic is actually hitting the fg interface. FG stands for floating IP gateway, so it's the gateway. And before I forget: how many of you know what a Linux namespace is? A network namespace. Good. For those of you who don't know, namespaces are a Linux kernel feature that allows you to isolate resources and processes. For Neutron in particular, they allow us to keep separate IP routing tables and separate iptables rules. And I already explained what a floating IP namespace is, so do you all know now what a floating IP namespace is? Okay. So again, it responds to ARP by proxy. Otherwise it's only a router.
Its role is to respond to ARP requests on behalf of the qrouter. The floating IP namespace is connected to the qrouter using a point-to-point connection. As you see on the right there is fpr, FIP to router, and on the left, on the qrouter, there is rfp, router to FIP. So it's just a point-to-point connection.
Then there is this command, ip route get. Here I am asking: if I want to go to my floating IP, how is this namespace going to route me? It's important to know this command, and it's important to know the architecture, because if you run this and see that it's routing somewhere you're not expecting, you will know that there is an issue here and that you should check what's going on with the routing table.
So I go into this router. I list my namespaces. We have only one floating IP namespace, because it's the only one responding to all the proxy ARP requests. With ip netns exec and the ID of the namespace, I can see the fg interface, which responds to ARP requests on behalf of the qrouter, and I can see the point-to-point connection to the qrouter. If I do a tcpdump inside this namespace, on the fg interface, for ICMP, I see the same result as before, with no replies. So now I should check where this router, or namespace, thinks the packet should go next. I do ip route get and I specify the IP address of my floating IP, 10.246.155.239. You can see it's going to the point-to-point interface toward the qrouter. So let's see if it's going out of that interface. I do a tcpdump on it, and indeed it is. Okay, let's get rid of that.
Next, I go to my qrouter namespace, and here some magic is happening. All the way along we have been saying that the request is coming from Lida's IP to the floating IP. On the rfp interface, we're going to see the same thing.
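The capture steps walked through so far can be written down as a small Python helper that only builds the command lines and does not run them. The namespace and interface names below are placeholders, since the real ones carry IDs that are specific to each deployment; the tcpdump switches (-n, -e, -i) and the ip netns exec and ip route get commands are the ones named in the talk.

```python
def tcpdump_cmd(interface, capture_filter="icmp", namespace=None):
    """Build a tcpdump command line: -n (no name resolution), -e (show MACs)."""
    cmd = ["tcpdump", "-n", "-e", "-i", interface, capture_filter]
    if namespace is not None:
        # Run the capture inside a network namespace.
        cmd = ["ip", "netns", "exec", namespace] + cmd
    return cmd


def route_get_cmd(destination, namespace):
    """Ask a namespace's routing table where it would send a destination."""
    return ["ip", "netns", "exec", namespace, "ip", "route", "get", destination]


# Placeholder names; substitute the IDs found on your own compute node.
FIP_NS = "fip-EXAMPLE"
FG_IF = "fg-EXAMPLE"
FPR_IF = "fpr-EXAMPLE"
FLOATING_IP = "10.246.155.239"

steps = [
    tcpdump_cmd("br-ex"),                   # 1. external bridge
    tcpdump_cmd(FG_IF, namespace=FIP_NS),   # 2. fg interface in the FIP namespace
    route_get_cmd(FLOATING_IP, FIP_NS),     # 3. where does the namespace route it?
    tcpdump_cmd(FPR_IF, namespace=FIP_NS),  # 4. point-to-point link to the qrouter
]

for step in steps:
    print(" ".join(step))
```

Keeping the commands as argument lists makes it easy to hand them to a runner later, but printing them is enough to use this as a checklist.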
After that, though, we have iptables, and we have NAT configured here. Everybody knows what NAT is? Okay. NAT is going to translate the destination in the IP packet. It looks at the destination IP address, finds that it's a floating IP, and translates it to something else: the private IP that was assigned to your instance when you launched it. Only after iptables finishes this translation does it look at the routing table. So it does the translation first, and then it looks at the routing table. This is very important to know.
Okay. So if I go into this router... can you do that for me?
Now you need my help?
Yeah. I need your hands on that. Just click on the router.
Yes, sir.
Thank you. I recommend you always name your devices. Here I have named my router "summit". I grab the ID of this router, and the namespace is the keyword qrouter with that ID attached, so in the ip netns output you can find it by grepping for the keyword and the router ID. Now let's see if I'm getting the request on the point-to-point connection between my floating IP namespace and my qrouter. First, I list the interfaces. You'll notice that the rfp interface and the fpr interface have the same ID; it's just that the first three letters are switched. I do the same tcpdump, and I still see the same result as before: it's going to the floating IP. So I check what the routing table has for the floating IP, and you'll notice something weird: it thinks that it should send the packet to its loopback interface. You see it: dev lo. So what's happening here? What's happening is that the translation takes place before the routing. If I look at my iptables rules, I see this in the PREROUTING chain, so it's pre-routing. This is very important.
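The order of operations in the qrouter, destination NAT first and route lookup second, can be modeled in a few lines of Python. This is a simplified sketch, not how the kernel implements it: the routing table is reduced to two entries, the interface name `qr-example` is a placeholder, and the private IP and /16 tenant range are the values that come up later in this session.

```python
import ipaddress

FLOATING_IP = "10.246.155.239"
PRIVATE_IP = "10.252.33.99"   # the instance's private address in this session

# PREROUTING-style destination NAT: floating IP -> private IP.
dnat_rules = {FLOATING_IP: PRIVATE_IP}

# Simplified routing table of the qrouter namespace: (prefix, device).
routes = [
    (ipaddress.ip_network("10.246.155.239/32"), "lo"),      # locally owned address
    (ipaddress.ip_network("10.252.0.0/16"), "qr-example"),  # tenant subnet
]


def prerouting(dst):
    """Rewrite the destination if a DNAT rule matches, else leave it alone."""
    return dnat_rules.get(dst, dst)


def route_lookup(dst):
    """Longest-prefix match over the simplified routing table."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, dev) for net, dev in routes if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def forward(dst):
    """What actually happens to a packet: translate first, then route."""
    return route_lookup(prerouting(dst))


print(route_lookup(FLOATING_IP))  # looking up the floating IP directly
print(forward(FLOATING_IP))       # the path the packet really takes
```

Looking up the floating IP directly points at the loopback device, which is the confusing result seen with ip route get, while the translated packet leaves through the qr interface.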
The floating IP is being translated into the private IP, and this happens before we look up the routing table. So I should actually be looking up the private IP instead of the floating IP. When I do that, I find that it's going out of the qr interface. The qr interface is the gateway that your instances are connected to. If I do ip a show, where I can specify the one interface I want to see, I see that this is the default gateway. It's on the same subnet as my private IPs, and all my instances need to go through this gateway if they want to get outside. So I'm going to do a tcpdump here to see if I'm getting anything different on this interface. And you will notice a different result than before. Lida's IP is still the same, because the source doesn't change. What changes is the destination. The destination is now the private IP, no longer the floating IP that he was trying to access. This is normal. So far everything is good.
So what is next? Next, the qrouter is going to send the packet to the integration bridge, and from the integration bridge it goes to the Linux bridge. I'm not going to go into the integration bridge, because it's just acting as a layer 2 switch. Nothing very special is happening there. The reason we have two switches is that OVS cannot use iptables, and we need iptables here for the security groups. When you configure security groups, you block one port, you open another port, and it's only the Linux bridge that has access to this. Here you will also learn how to identify the tap port connected to an instance, because you saw that we have many, many instances. So I'll teach you how to identify this port, this tap interface. Should we open it?
If we can find the mouse, yeah. Oh, here it is. Track that mouse.
If you run ovs-vsctl show, you'll see a lot of interfaces. And if you do ip a, it gets even worse. It's really overwhelming.
So how do I know which interface I'm connected to? If I do nova list and then nova show on my instance, I find the instance ID. You see the instance ID in yellow. If you then run virsh, which is a KVM command, you see it's attached to ID 76. There's another virsh command, virsh domiflist, and I give it that ID. And here you go, here's the interface name. So let's do a tcpdump here. We can do it straight on the host, because the Linux bridge is in the host's Linux kernel. We do the tcpdump and get the same result as before: the request is going to the private IP, but there is no reply. We are sure that we are sending to the instance, so let's see if we're getting anything back. I change the tcpdump filter to src host, to see if I'm getting anything back from my instance at 10.252.33.99. And hey, indeed. Can you stop this here?
My instance is asking everybody on the layer 2 domain, which is the Linux bridge and the OVS integration bridge. So who do we have there? We know that we have our qrouter there. But do we have Lida's IP there? The request is now getting to the instance, but the instance needs to reply back, so it has to know the MAC address, because for the instance, Lida's IP is on the same subnet as its private IP. For this reason it thinks it should send an ARP request to find out who has this IP on this layer 2 domain. It's asking, who has this IP? So hey, Lida, do you have this IP? Is this your IP?
Looks like it.
Well, no. No, because from the point of view of the instance, this IP doesn't exist. Nobody is replying for this IP. Is it yours, Diego? I'm just kidding. It's not Diego's IP. Maybe it's the qrouter's IP? Again, this is from the point of view of the instance. It's trying to find where this IP is, and nobody is responding. Is it instance B? No. The qrouter? No. Well, guess what? It's nobody's IP.
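The instance's decision to ARP for Lida's address instead of sending the reply to its gateway can be sketched with Python's standard ipaddress module. The two addresses are the ones from this session; the /16 prefix on the instance side matches the 10.252.0.0 to 10.252.255.255 range given in the talk, and the function name and return labels are made up for illustration.

```python
import ipaddress

# The instance's private address with the tenant subnet's prefix length.
instance_if = ipaddress.ip_interface("10.252.33.99/16")
# The laptop that sent the ping, as seen by the instance after DNAT.
lida_ip = ipaddress.ip_address("10.252.72.99")


def next_hop_decision(local_if, dst):
    """Mimic the host's choice: on-link destinations get ARPed directly."""
    if dst in local_if.network:
        return "arp-directly"    # assumes the peer sits on the same layer 2 domain
    return "send-to-gateway"     # off-link: hand the packet to the default gateway


print(next_hop_decision(instance_if, lida_ip))

# With a non-overlapping tenant subnet the reply would go to the qrouter instead.
safe_if = ipaddress.ip_interface("10.252.33.99/24")
print(next_hop_decision(safe_if, lida_ip))
```

With the overlapping /16 the instance treats the laptop as a neighbor and waits for an ARP reply that never comes; with a subnet that does not cover the laptop's address, the reply is handed to the gateway and finds its way back.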
From the point of view of the instance, nobody has this IP on this layer 2 domain. Nobody has it. So what happened? What caused this issue? I can exit out of this. Just close it, we don't need it anymore.
On the right, you see Lida's IP is 10.252.72.99/16, and he was always trying to reach the floating IP. What he didn't realize is that his private IP was from the same range as the IP on his laptop. He thought that as long as he didn't have a conflict with his floating IP, it was okay. But the instance needs to reply back, and from its point of view anything in the range 10.252.0.0 to 10.252.255.255 is connected to the same layer 2 domain. So it just sends an ARP request on the left side and never gets a reply. It thinks that nobody has this IP, keeps asking, and is never able to reply back to Lida. That is what happened here. Okay. So always be careful when you're configuring your subnets. Okay, Lida?
So I shot myself in the foot. Thank you.
No problem. Yeah, that was expected.
Thank you. It looks like that solved my problem, temporarily. Good. We're good.
Well, later in the day, I'm trying to move some files between my VMs, and I can't. So here I go.
What happened? Another problem on the same day? What are you trying to do?
I'm trying to move files between two servers. I'm trying to copy files, and I can't.
Okay. Can you ping them?
It's in the ticket. Everything else is working for me. The connectivity is fine.
Yeah, let's cut this guy's drama, because this is derailing really fast. Let me go ahead and check what's going on here. Here we have an overview of his architecture. You see a web server and a DB, and right now he's doing a copy, an SCP, between the web server and the DB. These are now using only private IPs; we're not hitting any floating IP anymore. And they're sitting on different machines. We'll see that in a second. So yes, ping works. So it's not really my problem, buddy.
You go ahead and troubleshoot it. Just kidding, before he escalates again. So ping does work, but for some reason SCP does not. You can SSH, but you can't really do much. Okay, just keep that in mind for now.
Wait a second, I just talked to this guy. Come here, take a look at this. He sent an email to all of sales, all of top management and all of IT, saying that once again IT cannot deliver.
Oh, my God, this is in danger.
Didn't we just solve this problem? I mean, I talked to him like five minutes ago. This is embarrassing. My phone will be ringing in a second. Hold on. Let me make sure this is on airplane mode. Let's drop that too. All right, I don't want this anymore. Has this happened to you? No? It's okay. You're not on camera, so you can raise your hand.
So, a little explanation of the instances. We have the web instance running on one side and the DB on the other side, and you can tell they should be able to just talk to each other through this private address space. So that's Unicorn, essentially: in our example, a two-instance application sitting on different compute nodes connected by an underlay network. We're traversing the physical network, going through a router or an L3 device, and on top of that we have a tunnel in the overlay network. A really common scenario is to use something like VXLAN tunnels. You can also do it with GRE and some others. In our example with VXLAN, if you were to take that earlier picture with the boxes and the network and peel it open, this is what you would have. It's pretty much what Mohammed was showing before, but now we have traffic moving from one side to the other. It's going between instances, or east-west in network lingo, and it's going through various places. But essentially, we're now going down the VXLAN tunnels.
If you look at the picture, it is extremely overwhelming, even for folks with a lot of networking background, to say what is happening here. Unless you've done it before in your own environment, that is. Over time, I guess, we get to know some of the usual suspects, and you just go and check them. What we have here is essentially mirrored, and we know that we can ping. So let's chop off half of the diagram and look at that for a second. We'll focus on this side for now, but the same applies to the other side as we go through the troubleshooting.
First, let's take a quick look at the tunnel bridge. On the tunnel bridge you can run some of the Open vSwitch commands, essentially what Mohammed showed before, to display all of the ports. You can also see the endpoints on each of these interfaces, so you can do some initial troubleshooting there. If you don't have this, you have a problem here. Now, I don't think this is the problem right now, because I can ping and I can SSH, so it's probably not here. Let's move along a little bit.
The next thing we have to look at is the Linux bridge and what's happening in there. There are various little components that could be in play. If we do brctl showmacs, we see that we're actually learning MAC addresses, so we're okay on that front as well. Then I do brctl showstp, and I see that everything is in order there too. It's forwarding. I don't seem to have a problem here.
The next obvious place to look is the instance itself. Inside the instance we can definitely ping as well, so we can see that. And to check whether it could be something else, I just did an ifconfig. I don't see anything out of the ordinary coming out of it. So this leads me to think that, you know, it's got to be something else.
So let's try to troubleshoot this as a group. Do you think this could be something like iptables running in userland in the instance? Anybody think this could be a firewall issue? Something like a rate limit? No. Okay. How about, say, a security group? Maybe. Okay. Did I hear MTU? MTU, right. You've been there before, right? So that's the aha moment we just had.
If you look back at this, and it might be a bit hard to see on the screen, so let me zoom in: does that MTU look correct to you? 1500? It is if it's physical, or if it's not OpenStack and there's no VXLAN. That number, my friends, is the magic number here. So what number should it be, Mohammed?
Yeah, exactly. 1500 is normal everywhere. Before OpenStack, nobody configured less than that. We had physical or virtual machines everywhere, and you could configure 1500. But once you introduce the VXLAN tunnel, it adds 50 bytes on top of the packet. If the packet is 1500, it becomes 1550 when it's sent to the next hop, and the next hop doesn't agree with that and starts dropping packets.
Now, why does ICMP work? Well, smaller packets will be fine. But once you try something else, you start having issues. If we go back to this and change it, this is how it should look. The normal behavior is that you spin up an instance, and once the instance gets its lease over DHCP, you're all set and you won't have this problem. There are a few scenarios where this could still happen. Knowing that, you can just change it with ifconfig, and that's what I did, just to test. And then you can see that SCP works right away. Before, it would stall and then stop. So first you make the change, and then, of course, I ran SCP to confirm that the issue is actually gone. So you're all clear. No more drama necessary. You can go ahead and launch Unicorn.
Cool. Thank you. Thank you.
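The MTU arithmetic from this second incident can be checked in a few lines of Python. The 50-byte figure and the 1500 and 1550 values come from the talk; the breakdown of the 50 bytes assumes VXLAN over IPv4 with no VLAN tag on the inner frame, and the function names are made up for the example.

```python
# Bytes added to an instance's IP packet before it leaves the compute node,
# assuming VXLAN over IPv4 and an untagged inner Ethernet frame.
INNER_ETHERNET = 14   # the instance's own Ethernet header, carried inside the tunnel
VXLAN_HEADER = 8
OUTER_UDP = 8
OUTER_IPV4 = 20
VXLAN_OVERHEAD = INNER_ETHERNET + VXLAN_HEADER + OUTER_UDP + OUTER_IPV4


def encapsulated_size(instance_mtu):
    """Size of the outer IP packet for a full-size packet from the instance."""
    return instance_mtu + VXLAN_OVERHEAD


def fits(instance_mtu, underlay_mtu=1500):
    """True if a full-size instance packet survives the underlay unfragmented."""
    return encapsulated_size(instance_mtu) <= underlay_mtu


def max_instance_mtu(underlay_mtu=1500):
    """Largest instance MTU that still fits once the tunnel headers are added."""
    return underlay_mtu - VXLAN_OVERHEAD


print(VXLAN_OVERHEAD, encapsulated_size(1500), fits(1500), max_instance_mtu())
```

This is also why ping and an interactive SSH session worked while SCP stalled: small packets stay well under the limit even with the extra headers, and only full-size packets from a bulk transfer exceed it.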
Looks like I did shoot myself in the foot. By the way, I'm not that bad in real life.
I take my hat off to this guy. He's just so humble here.
Yeah. Let's just wrap up, so you can tell upper management what happened. We had two incidents. Number one is what we are calling the clash of subnets. Think about this, because it could probably happen here too. We have a real environment, a rack sitting right here, and I think if you were using one of your laptops here and got an IP address that conflicts with some of the workloads in our OpenStack here at the summit, you would probably find yourself in case number one. So with private clouds, that's something we have to keep in mind.
Then number two. I really enjoyed this one; a lot of hands went up for it. I think a lot of us from the old school are used to seeing a 1500 MTU, and this is what the MTU can cost you. So it's something to watch out for. And then we go deeper to understand it, because it makes no sense. Why would that happen at all? Well, it happens at Unicorn, which is a super cool next-generation application.
Just one comment here. It's not normal, because when you run OpenStack with DHCP, by default the configuration that DHCP pushes to the instance is going to be 1450. So something went wrong here. Somebody manually configured this, or some developer from the old school came and said, oh, this should be 1500. They didn't know what the consequences were.
That's right. Yeah, that's right. What happened is that they're doing this CI/CD pipeline, and DD, one of the devs, just put in a Heat template with the subnet. So there we have the clash of subnets; he just put that in there. And to make his life easier, since he's using Heat, he also created an image where he hardcoded the 1500. So, you know, he set up the interface you saw there. We've got a picture of DD. That's the last known picture of him.
To wrap up a little bit, we've talked about two scenarios. We had to rush through them a bit, and we went through all of the different layers. First, external to internal, or north-south in network lingo. Again, make sure that you understand this. Now that we have private clouds, it takes some extra thought. And then instance to instance, east-west, where the MTU played a role.
To learn a little bit more: I think some of us will be glad to know that we're still recommending TCP/IP Illustrated by Stevens. It has been a bible for a lot of us, and it's still highly recommended. OpenStack Networking is an excellent source of information as well. We've listed a few of the RFCs that we've mentioned here or consulted at one point or another, and a few of the Linux man pages. We're going to publish these slides and a few of the diagrams on the website that we have for VxRack, so feel free to go and fetch them. And as a bonus, we will also put the full diagram on the website, the full left to right with all of the boxes. I have it in an XML format, so you can actually change the layers and modify it, and you don't have to redraw all of that. It's read-write, so you can annotate it with the commands that apply in the different places and improve it with your team as you go. A little bit of extra bonus tips: some of the commands will be on the slides as well.
And let me see... I don't think we have time for Q&A. We're running short on time, but maybe we can take a few questions. So do we have any questions? Yes? Do you mind coming to the microphone? There's one right here. And for those of you leaving, we have a raffle, so you may want to stay. There's a little gift in here as well. Yes, sir.
My question is about avoiding the MTU problem...
Maybe if you enable jumbo frames on the whole path, you will not have this problem.
What?
If you enable jumbo frames along the whole path, you don't have the problem with the MTU.
So I'll repeat the question: if you enable jumbo frames along the whole path, you wouldn't have that problem. Yeah, but you would have to reconfigure your physical devices.
Exactly.
That's a good point. And the VXLAN tunnel as well. Thank you. Do we have another question? Maybe we can take a couple of extra questions. All right. All right. So... well, thank you. We are at booth A1. We'll take more questions there if you want. If you want to see... thank you. This is VxRack Neutrino. Please come and join us, and we'll talk there. Thank you very much.
For the raffle. For the lucky one. All right. So... One with two. What are we? Oh, man. Only one. Two, six, three. It's a Bluetooth speaker. Two, six, three. He's there. Over there. Yeah. Give him a hand.