Well, hello and welcome again to another OpenShift Commons briefing. We're really pleased to have with us today Alban Crequy from Kinvolk, a recent new member of the OpenShift Commons, and a number of folks from Weaveworks are here as well. The topic we're talking about today is testing web services with traffic control, using Weave Scope, and showing it working on OpenShift. I saw Alban give this talk at KubeCon in London, and I knew I had to get him to do it on OpenShift, because it's one of the most useful testing techniques I've seen in a long time. It also shows off a lot of what you can do with Weave Scope. So what I'm hoping we'll do today is get Alban to tell us all about testing and traffic control. But then we'll also try and push some of the folks from Weaveworks, we have a number of them on the phone, Ilya and others, to maybe give us another little demo in the second half, during the Q&A, of some of the new features of the Weave Scope product working with some of the new features of Kubernetes. So this should be a really interesting talk. I'm going to let Alban introduce himself and take it away from there. Thank you for the introduction. So I will talk about testing web services with traffic control on OpenShift. I will start by introducing myself. I am Alban. I have worked on the container runtime rkt for the last months, and I'm currently the tech lead on rkt. Previously, two years ago, I worked on traffic control for a different use case: it was for in-car applications for the automobile industry, and I'm trying to reuse that knowledge in a different context. I work at Kinvolk. We are a Berlin-based company, and we work on different foundational new technologies. We work on rkt with CoreOS. We work on systemd, and part of the systemd team is in Berlin as well, so that's quite good.
And we work on the Linux kernel, so whenever we need to make changes in Linux, we can do that as well. And we work on OSTree. OSTree is kind of like git for operating system binaries. You can find more about that online, but I will start with the plan. I want to talk about what traffic control is and how it works on Linux, and then how it can be used for testing applications. Then I will do two demos. The demos will be on OpenShift; they will use these capabilities and are based on Weave Scope. So first, what is traffic control? How does it work on Linux? Traffic control has existed for a long time, and it can be used for different use cases. It can be used to have a fair sharing of the bandwidth when a web server is connected to several clients: we don't want one client to starve the others; we want each client to have a fair share of the bandwidth. We could want to reserve bandwidth for specific applications. Or we want to avoid bufferbloat. Bufferbloat is when a router on the internet has buffers that are too large, which can introduce problems with latency and congestion on the network. But here, we won't use traffic control for any of that: we will use traffic control for testing web services. Traffic control on Linux is implemented in the Linux kernel using something called a queuing discipline. A queuing discipline, or qdisc, is something you can plug on a network interface. For example, on eth0, we can plug a queuing discipline, and it will decide what to do with the packets to emit: when to emit them, and which packet to emit first. Queuing disciplines can be configured on Linux with the `tc` command, which uses a netlink socket to talk to the kernel and configure the queuing discipline. By default, there is always a basic queuing discipline. There are different kinds of queuing disciplines; I will just mention one example here, called Stochastic Fairness Queueing, SFQ.
This one puts the different TCP connections into different queues and then does a round robin over them to select which packet to emit next. That's not the one we will use, just an example of a queuing discipline. And there are plenty of different queuing disciplines that can be configured differently. But how can we use that for testing? There is a queuing discipline called the network emulator, or netem. And netem has different configuration parameters. There is, for example, the bandwidth: we can configure it to limit the bandwidth of the traffic emitted by the network interface. Or we can increase the latency. Or we can say, hey, I want to have 2% packet loss. Or plenty of other options that you can find in the man page of netem. We are going to use that. But we don't want to configure the machine, the computer as a whole; we would like to configure only the network used by one application. One way to do that is to use containers. Each container can run a specific application in a different network namespace. That means each network namespace will have its own network interface. In this example, we have container one with one network interface called eth0, and container two having a different network interface. And then we can have a testing framework configuring a queuing discipline on each specific network interface to add some latency or to drop some packets with netem. We want to do that in OpenShift and Kubernetes. OpenShift uses Kubernetes, and in Kubernetes there is the concept of pods. Pods are a group of applications running together in the same context, in this case in the same network namespace, and they can communicate together on the network. We want to configure the network parameters on each pod, to say: for this pod there is a high latency, or different parameters for the bandwidth. I want to be able to configure different scenarios.
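As a minimal sketch of the commands involved here, assuming root privileges, an interface named eth0, and a hypothetical container process PID of 1234 (none of these values come from the talk itself):

```shell
# Show the queuing discipline currently attached to eth0
# (there is always a basic default one)
tc qdisc show dev eth0

# Replace it with the network emulator: 2 seconds of latency, 2% packet loss
tc qdisc replace dev eth0 root netem delay 2s loss 2%

# To affect only one container, run the same command inside that container's
# network namespace, for example by entering it with nsenter and the PID of
# the container's main process
nsenter --target 1234 --net tc qdisc replace dev eth0 root netem delay 2s loss 2%

# Remove the netem qdisc again to restore normal traffic
tc qdisc del dev eth0 root
```

These are infrastructure configuration commands and need root on the node; a testing framework such as the TCD daemon described next automates exactly this kind of per-namespace configuration.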
And for that, I implemented a small demo called TCD, for Traffic Control Demo, that runs on each Kubernetes node. It's a daemon, and it receives commands on a Unix socket, either through gRPC or through D-Bus. It accepts remote method calls, such as one to say: I want this network interface to have some latency or some limited bandwidth. You can find the source of TCD online on GitHub. Then I integrated TCD with Weave Scope; I will show you a demo after. Weave Scope is an application showing you all your pods and all your containers on the cluster. In this slide, we see one view of one container, and I added a few buttons related to traffic control; I will show you in the demo what they do. Weave Scope is composed of two parts: the Scope app, which is the web view, and the Scope probe, which runs on every node. The Scope probe receives the commands when we click on the buttons, talks to the traffic control daemon, and configures the traffic control for each specific pod. I will start a demo now, so I'm stopping my slides. First, this is OpenShift, and I pre-configured different applications on it. I can see them on the web UI here, or I can see them from the terminal using oc commands, or kubectl, since it's using Kubernetes. In this example, I have different applications, and the first one I will show you is one called ping test. This is actually just a very small script. It starts a replication controller of three pods, and each of them runs a simple script that downloads a small file from the internet every two seconds. So now I'm going to Weave Scope. Now I have a view on all my pods and my containers, and I can see effectively the three ping-test pods that we just saw in OpenShift. If I click on one of them, Weave Scope allows me to have different statistics, so I can monitor my pod. I can see, for example, the command that is running.
I can see if it takes much CPU or not; since it mostly just pings, it doesn't take much CPU. That's good. And I have some buttons to interact with it. One of them is called attach. I'm clicking on it. What it does is show a terminal attached to this container, so now I can see what it actually does: it downloads a small file, a 15-byte file, from the internet every two seconds. And I see my network is quite fast; it takes less than one second to do that. But let's use one of the three buttons I added, called traffic speed: slow, medium, or fast. Let's click on this one. What it does is configure the latency: it adds some latency to that container. It only does that on the specific container, not on the complete computer, so the other containers should be unaffected. Since I clicked on this button, I see that it takes more time to download the small file: it takes more than one second. And if I click on the slow button, it will add a latency of two seconds. And since loading the file from the internet takes around one round trip just to establish the TCP connection, it actually takes more than six seconds. So I leave this one like that, and I will look at another ping test, another replica of it. This one should not be affected, because each container has its own configuration. That was the integration with TCD. But how can we use it for real testing? For that, I took the Kubernetes example application called guestbook. I will show it to you right now. Guestbook is a really simple application. It has a frontend written with Apache and PHP, and a Redis backend. I can add some messages here, and they are stored in the backend. And I see here it seems to work quite fine. But does it really work? Here I have a good network connection, so I might not see the problems. But I'm going to go back to Weave Scope and find... sorry, let me find the correct container.
So I found the container running Apache and PHP. That's the frontend of my application, the application that you can see just here. And I will add some latency. So here I have two seconds of latency. If I refresh the page, I press refresh, I see it's quite slow. I see the wheel spinning here. Oh, no, it has refreshed. But where are my messages? They seem to be lost. Are they really lost? Did I... oh, no, they appear. It was just slow. I think that's a user experience bug: when I refresh, at first all my messages seem lost, I don't have any user feedback, and only later do they come back. It would be good to have some message saying loading or something, some user feedback. And for that, I developed a new version of the guestbook. It's really similar, but it has different user feedback. Now I will add some latency to that new version as well. I will show you how it's configured. If I go back to the list of replication controllers, I have one replication controller for version one of the guestbook and one for version two, and they both connect to the same Redis backend. So on this one, the new version, I will put some latency. And if I go to version two, here I see the message "loading". That's a user feedback message. It means that the JavaScript code which is fetching the data is still running. And if I refresh again, it takes some time to get the new page, and here it says loading, so I know that it's not finished. Okay. And now I will add a new message: good evening. And I have another message, "sending", so I know that it's not finished yet. So that's good user feedback. So far, I have only tested manually: I go to Firefox, I click on buttons, and I change the configuration thanks to Weave Scope. It would be quite good to automate that.
If I have unit tests, I don't want the unit tests to require a tester to click in Firefox to check if it works correctly. So let me go back to my slides. I would like to do that in a testing framework, and there are different testing frameworks that exist. There is Selenium, to simulate different kinds of browsers like Firefox and others. And there are Agouti and Ginkgo with Gomega. I will show you a demo with those. So here I have two files that configure a small test. I use Agouti connected to ChromeDriver, and it will run the following test, a small test written in Go. What it does, I hope you can see, is connect to the following web page. It expects to fetch the page correctly, then it tries to find this specific attribute in the HTML page, and then it expects to find the message "loading". With that, I can automate the test process: I can check that it shows all the user feedback messages. And I can run this with Ginkgo, and hopefully it will load Chrome, in this example. Since I configured the latency quite high, it takes some time to load the page. Then the Go test checks that the message "loading" is actually there. Then it sends a message, and it checks that the page returns something. And the test passes with success. If I was not using traffic control, it might not be possible to do that, because the message might disappear so fast that the test might not have seen it. Let's see: if the page comes back too fast, it might not work. So it checked the page really fast, and the test didn't have time to get the information from the HTML page before it went away. That's my demo for today. And I would like this kind of testing with traffic control to be developed more, and for more web services to use this kind of technique to test their applications. And I have a wish list of things that I did not implement but would like to have.
For example, when we attach a qdisc on a network interface, we apply the same rules to all the traffic. With my implementation, it's not yet possible to classify the traffic into different classes. For example, I would like to classify HTTP traffic to one IP address, or DNS traffic, and apply different latencies and different parameters to each. It's possible to do that on Linux. On Linux, you can attach to a network interface a qdisc with filters and classes. There is one filter called u32. What it does is inspect the network packet, look inside, and apply a mask on it to detect, for example, what destination port it uses, or what destination IP it has. Based on that, you can classify the traffic into different classes, and each class can have a different qdisc: you can attach a network emulator qdisc with some latency, another one with some packet loss, etc. My small demo, TCD, doesn't implement that, but it would be possible to add that feature. It would be possible to use TCD to configure the traffic control so that one pod can have different parameters when it talks to one pod or another. For example, when it talks to this pod, it will have a latency of 100 milliseconds; to this other pod, it just drops the packets. I think this kind of thing could be useful for testing, for example, the Raft consensus algorithm, which is used in etcd. Raft includes a leader election protocol: each node sends a heartbeat at a specific interval depending on the configuration, every 100 milliseconds for example. If there is no connection, there is an election timeout, and the nodes start a new election to select a new leader. If we add a high latency, like 5 seconds, we would expect it to disconnect the node, and I would be interested in testing that with traffic control. There are more techniques which could be used, with the Berkeley Packet Filter, BPF. It's possible to write programs in C, compile them to a specific machine language, BPF bytecode, and load that into the kernel.
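The filter-and-classes setup just described might look like the following on Linux; this is a hedged sketch, again assuming root privileges, an interface eth0, and a hypothetical destination IP of 10.0.0.42:

```shell
# Attach a classful prio qdisc as the root; it provides several bands (classes)
tc qdisc replace dev eth0 root handle 1: prio

# Use the u32 filter to steer traffic for one destination IP into band 1:1
tc filter add dev eth0 parent 1: protocol ip u32 \
    match ip dst 10.0.0.42/32 flowid 1:1

# Attach a netem qdisc with 100ms latency only on that band;
# traffic to other destinations stays on the other bands and is unaffected
tc qdisc add dev eth0 parent 1:1 netem delay 100ms
```

This is the standard pattern for per-destination traffic shaping; a daemon like TCD could issue equivalent netlink configuration per pod.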
And that would be useful because the queuing discipline could use that program to decide how to classify the packets, so you could write more complex rules. Recently, BPF was extended into extended BPF, eBPF, which implements a new feature called eBPF maps. Maps are a hash table in memory which can be shared between the kernel and user space. That hash table could be read from user space to get some statistics, and the queuing discipline could execute the program and update the hash table in memory. I think it might be interesting to develop that. Thank you, that's almost the end of my talk. If you want to try the demo yourself, you can get the GitHub repository; it has instructions to repeat the demo step by step, and the source code is available. You can also read the blog post that Alessandro posted recently. Just before stopping, I would like to mention the systemd conference in Berlin in September, and I will be at ContainerCon in Toronto. That's an awesome place to go and hang out and learn about this stuff. The CNCF, the Cloud Native Computing Foundation folks, will be there talking and networking a little bit and trying to promote the work that we're all doing. Thanks, Alban. Alban, I have a quick question for you. You talked about a number of these things being on your wish list. Where does that work need to be done in order to get your wish list fulfilled? Is that in TCD? Or is that something that needs to be figured into Weave Scope? If someone wants to work on that, where do they need to go? Let me go back to this slide. So the features in the kernel have already been implemented for a while, so that work is done. But the daemon I showed, TCD, at the moment only works with very basic features: it can only change the latency, the bandwidth, and packet drop. But it receives commands over gRPC or D-Bus remote method calls.
And it would be possible to add more method calls to add more features. So to the question of where to do that work: it would be possible to do it in TCD, and then a testing framework, when configuring the network traffic control, could call TCD to do it. Weave Scope could also connect to that, if there is a UI in Weave Scope for it. Or it could be an external testing framework. Yeah, that's great. Good, that's where you wanted to point people to. Perhaps we can round up a few folks to collaborate with on getting that done. That's really what we're trying to do here at Commons: find other people to collaborate with, distribute the code, and give you some feedback on your work. I know we also have a number of folks from Weaveworks on the call, and I would love it if I could coerce one of them. There aren't any questions so far; you did a great job, Alban, answering most of the questions. I think the questions will come on the email list afterwards. But I'm wondering if we could get Alfonso to share his screen and show off a little of the new features of Weave Scope, because I think we could see the Kubernetes integration. That would be great; we'll sneak that in here today, too. Hello. I'm Alfonso Acosta, and I'm a software engineer working on the Scope team. Alban has done a pretty good job of showing some of the capabilities of Scope, but I wanted to show what we've been working very hard on recently, which is the deep integration with Kubernetes. By the way, this all works on OpenShift as well, but I didn't have an OpenShift cluster around, so I'm going to be demoing on a Kubernetes cluster instead. The first feature we added, which is simple but shouldn't be undervalued, is how you install Scope on Kubernetes. A lot of monitoring solutions are complicated to install, and it's part of the philosophy of Weaveworks to make things as seamless as possible for developers and operations.
And we've put together a little service which generates the Kubernetes resources for you, so that you can install Scope with a single command. I've already installed it, so it will tell me that everything exists, but it's as simple as passing a URL to kubectl. kubectl is the command-line CLI for Kubernetes, and this allows you to deploy Scope on your infrastructure with a single command. It will create an agent on every single machine in your Kubernetes cluster. And what happens when you connect to Scope is that you get this view. In fact, our awesome frontend team has been working on a high-contrast view for this type of presentation; I hope you can see it more clearly now. What we're seeing here is a very simple application in which we have multiple clients, which you can see here and here, obtaining information from a web service, which goes through a frontend, connects to an app, and accesses several services: a database to know how many times the service was accessed, and so on and so forth. This is the container view, by the way: each of these hexagons represents a container. What Alban hasn't been showing is the new Kubernetes-related views. They follow the same philosophy as the container view you've seen before, but instead of showing containers, they show pods; edges represent connections between pods. You see how many containers a pod has and how they're connected together. And we have also added more views: views for replica sets, deployments, and services. And we have rich contextual information about each of the pods. So, for instance, if we click on a client, it will show you what pods it is connected to at this very moment. It also shows you the pods, and you can navigate through them. This is the detail view of the pod. In fact, I could be showing the logs of the client by clicking on this control. Actually, the client doesn't log anything to standard output.
But we could show that on the frontend, for instance, and we would be able to see the requests coming in. So Scope is not only a visualization product, but also a monitoring and controlling product. A typical use case is: you have your infrastructure and you want to scale it up; you want to, for instance, add an extra client to add more load, or you want to add another frontend. You would do that in Kubernetes through replication. Well, a few months ago you would do that through replication controllers, but now they've introduced deployments and replica sets. So in order to scale the client: the client has a corresponding deployment, and it says it has two replicas. But let's add an extra one. We have two controls here, to reduce the number of replicas or to increase them. If we click on plus and wait for a little bit, we'll see that the number of client pods should increase. It will take a little bit, but hold on. Let me check again. Okay, it doesn't seem to be working at this point, probably because it's a local cluster, but if we wait a bit we should be able to get... oh, here we go. We have three replicas, and there you go: now we have three pods. We also have contextual information about the processes running in the pod. We have the pod IPs, which are always very useful. And we also introduced a new feature, not related to Kubernetes but a general feature, which allows us to search textually. So, for instance, if we wanted to search containers by IP or by CPU, we would be able to see them here. Let's see an example: here, we highlight all the client containers, and it also shows us which other views contain those names. I think that's more or less what I wanted to show you. If you have any questions, I'll be very happy to answer them. That is super useful for people who are hosts or operators. It's a very nice visualization of everything that's going on under the hood.
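The scaling operation Alfonso performed through the Scope UI can also be done directly from the command line; a minimal sketch, assuming the deployment is named `client` (the real name in a given cluster may differ):

```shell
# Scale the client deployment up to three replicas
kubectl scale deployment client --replicas=3

# Watch the pods until the third one is running
kubectl get pods --watch
```

Scope issues the equivalent API call when you click the plus control, which is why the extra pod appears in the view a moment later.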
So, I actually don't see any questions in the chat at the moment. I think that between the blog post that Alessandro wrote and the work that Alban did here today, we covered a lot of territory and gave some really good insights into how to use Weave Scope. I'm looking forward to trying it out on OpenShift. As probably everybody knows, we have OpenShift Online and OpenShift Dedicated, so I'm thinking I can coerce the operations folks into testing it out too. I think there are going to be some interesting use cases coming up in larger deployments for some of this work. I'm really appreciative of you taking the time today to showcase all of this, and we will be posting this as a recording on YouTube and on blog.openshift.com; links will be there to the YouTube video and other places. And we will definitely see you, Alban, in Toronto, and in two weeks in Berlin. If anyone has any questions, they're welcome. I see one: Ilya, do you want to try and show the install process on OpenShift now? I mean, if we have time, sure. Yeah, we've got time on the recording, we've got a few more minutes. Yeah, cool. Okay, Ilya is sharing his screen now. Then I'm going to coerce you into writing this up as a blog post afterwards. Okay. So, I was just going to show really quickly how you install Scope on OpenShift. Oh, perfect. Can everybody see my terminal here? Is the font size good? Yep. Okay, cool. I've written down some notes here just in case I forget. I've just logged in to an OpenShift cluster for the purpose of this demo. I've pre-fetched the images, so we won't have to wait for the images to download from Docker Hub. And the login here is admin. Okay, cool. I create a new project. Next, I need to run a couple of commands here to apply a specific policy for Scope to work properly on OpenShift, setting administrative privileges. Okay.
So, we should be all set. And now this is basically the command that installs Scope. Sorry, I forgot the scheme in the URL. So, we have this URL, which you pass to `oc create`, or to kubectl if you like; the same URL will install Scope on Kubernetes or on OpenShift all the same. And right now, if we get pods here, we should see that some pods have been created. Okay. Well, in the meantime, I can log into the OpenShift console here; this is obviously a default install of Origin. Okay, there we go. Cool. We have these two containers running: there's a probe and the app. Those are the two components. The probe is what runs on each node, and it runs as a DaemonSet. The app is just a regular replication controller, and there is also a service for the app. Right. So, now I'd like to access the app, so I'll go to the OpenShift console here, find Scope, and create a route for it. Okay, cool. And this should give me a URL that I can access. And, oh, here we go. So, this loads the Scope UI. It's still... still thinking about it. Still thinking. Yeah. So, essentially, this is all you need to install it. Okay, here we go, it loaded, I think. Well, that's a reload. I might be running a couple more VMs, so maybe there isn't enough CPU. Okay. What else? I'll also create an app here that we can look at. I have this really quite simple app here. Something like that. It's loading... it's loaded. The load is pretty high; I wonder what else is doing that. Okay. This is the fun of running a live demo on your own machine. You'll have to get a bigger machine. That's right. Well, what I'm seeing is, essentially, Docker taking a lot of CPU. I don't know, that's interesting. Okay. Well, there is OpenShift; it has a bit of a load problem, but I assume it's running the latest OpenShift. Really, it's there. Okay.
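The steps Ilya walked through can be sketched roughly as follows; the server address, project name, service account, and resource URL here are illustrative assumptions, not the exact values used in the demo:

```shell
# Log in and create a project for Scope (names are hypothetical)
oc login https://openshift.example.com:8443
oc new-project weave

# The Scope probe needs elevated privileges to inspect the host's containers
# and network, so grant the privileged security context constraint to the
# project's service account
oc adm policy add-scc-to-user privileged -z default -n weave

# Install Scope by passing the generated resource URL to oc create
# (the same URL works with kubectl on plain Kubernetes)
oc create -f 'https://example.com/generated-scope-resources.yaml'

# Expose the Scope app's service with a route and find its URL
oc expose service weave-scope-app
oc get route
```

After the route exists, the Scope UI is reachable at the route's hostname, which is what loads in the browser at the end of the demo.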
Well, I mean, we can see that Scope is running on OpenShift. Unfortunately, we have a little load issue with this VM; I'll have to investigate that. That's okay. I think live demos off the cuff are always fun to watch. And I think it's a good example of the compatibility between Kubernetes and OpenShift: it's pretty much the same commands that you run. That's right, oc does pretty much the same thing as kubectl, plus all of the OpenShift work too. We try very hard at OpenShift to stay in sync with the latest release of Kubernetes; we're pretty much on a couple-of-weeks cadence after each release. I think we're very happy with Kubernetes for the cluster management, and I'm seriously thrilled to see the Weave Scope stuff looking so nice on OpenShift. So there's a lot to say thank you for. That's kind of the power of collaborating together on this. A lot of people worked behind the scenes to get everybody up to speed on OpenShift, and I really appreciate the work you guys did. So hopefully we can do a lot more stuff together in the coming months, and we'll see about trying to run some of this at scale. Yeah, absolutely. On OpenShift Online and OpenShift Dedicated as well. I think the other folks who are hosts and operators using OpenShift under the hood, like Getup Cloud and a few other PaaS providers out there, are going to be interested in this. I really do want to give another shout out to Alban. The traffic control stuff, and latency issues, are near and dear to my heart as a web app developer, so I think highlighting that, and how to use Weave Scope for it, was really awesome. I'm hoping we can entice some developers to start thinking about this in a more realistic way. Thanks again, everybody. I'm not seeing any more questions. Going once, going twice. We'll talk to you all next week for the next OpenShift Commons, and you can find this recording on our blog post next week.
Probably by Monday it should be up and available. So thanks again.