All right, it is 11:30 a.m. Eastern. Hello, everyone. Welcome to the Ask the Experts session for the Systems Engineering and Hardware track. We have a panel of experts here to answer all your questions around systems and networking, so please put your questions into the chat and we will field them as we see them. I'm going to hand it over to you now, Aaron. Please take over, and thank you all for doing this.

All right, thanks, Urvashi. So on the panel we have Waiman Long, Prarit Bhargava, Thomas Haller, and Dan Winship. We have a good mix of networking expertise, across desktop and server as well as cloud, plus kernel expertise. So feel free to ask questions anywhere you want, and if no one can answer them, then sorry.

First question, we'll get right into it. It's from Richard Jones: are you following RISC-V, and what effects might it have on the server landscape?

I'll take a shot at answering that. Richard, that's a great question. We've seen a bit of interest in RISC-V over the past little while. I'm following it purely from the point of view that it may become something more real someday, maybe a product. Adding another architecture is, as always, very interesting at Red Hat. It's not just an engineering effort; there's QE, productization, documentation, everything that's involved. Once we see enough interest from partners and customers, that's when we'll have a really serious discussion about RISC-V.

Along those lines, there are efforts to build data centers that are non-x86, and there are challenges related to that, both on the networking side and on the systems execution side. Are there any special considerations from networking or systems that we have to keep in mind?

Sorry, Aaron, my screen just went out and I only heard the very end of that. Could you repeat it? My apologies.

Sure. Some data centers are starting to migrate to non-Intel hardware, and that can bring its own challenges. What considerations do we have to keep in mind from the systems and networking side when thinking about running the full solution — not just the kernel and OS utilities, but also networking, hardware enablement, offload enablement, all that sort of stuff?

Well, I'm primarily kernel focused, so I always like to say we're at the bottom of the pile as it relates to the rest of the stack. We aim for parity across all architectures with our products, so it shouldn't matter which architecture you're running. That, again, takes effort; it's not a snap-your-fingers-and-hope-it-works kind of thing. We have to make sure that the drivers and devices these systems adopt are maintainable. We've had cases in the past where drivers came in on some architectures and could not be maintained for long periods of time. We also have to look at the overall specific purpose of some of these processors: RISC-V, for example, as you've noted, has very much a server footprint, whereas ARM, it was hoped, would have both a server and a laptop footprint. Those kinds of things definitely play into the decisions that are made. There are technical issues throughout the stack that I could get into, but I would potentially bore everybody on this call.
So I don't want to get too far into that, and with that long, rambling answer, I'll end it there.

I work on OpenShift networking, and from our perspective we just use the features that the kernel gives us. So if RHEL starts supporting other architectures, then we can support them at the higher levels. We don't really even notice the architecture that much.

Okay. Any other questions? It's a pretty short panel so far. Maybe, Thomas and Dan, you can talk a little bit about what's happening in server-side networking developments.

I can say something. I work on NetworkManager and, in general, on the management of networking in RHEL. NetworkManager is the component for configuring networking on a Linux host, and there are other components on top of it that use the API NetworkManager provides. I think the point is that we believe NetworkManager should be the central user-space component that provides a networking API to other components. On top of that we have another project called NMState, which simply provides a different API to applications. And that's it, I think.

For instance, in OpenShift we're looking at using NMState, because people have various complicated networking configurations in their clusters, and we don't want everybody running ifconfig or whatever on their OpenShift nodes, doing random strange things to the network. NMState is great because it lets them just declaratively describe their networking configuration and get it set up. That's still a few releases in the future for us.

So someone is asking: I am part of the OpenShift support team and can see we are now moving to OVN from OVS. How do we see OVN helping us move away from iptables? What if the northd pods go down? Do they have any mechanism to come back up without causing network disruption?

Okay, so, moving to OVN from OVS: the current OpenShift network plugin, OpenShift SDN, is built on raw OVS, basically. OVN is also built on OVS; it's just another layer of abstraction on top of it. In OpenShift 4.6, which is coming out this fall, OVN-Kubernetes, the plugin, will be GA.

How do we see OVN helping us move away from iptables? OVN-Kubernetes doesn't use iptables at all. OpenShift SDN and a lot of other Kubernetes network plugins use iptables to do all the service proxying, so in a very large cluster you can have tens of thousands of iptables rules, and things get slow after a while and cause problems. OVN-Kubernetes uses OVN logical flows, which compile down to OVS OpenFlow, to do all of the load balancing, so it doesn't have the iptables problems. It has some problems of its own, and we're working through those, fixing performance issues. But basically, if you're using OVN-Kubernetes, you have moved away from iptables; it uses maybe ten iptables rules total.

What if the northd pods go down? One problem we had even with OpenShift SDN was that during a cluster upgrade every node gets restarted and every pod gets restarted, and when you restart the OVS pods, that causes disruption, because when OVS isn't running it can't route traffic. So we eventually moved to an architecture where, instead of running OVS in pods, we moved it back into the system image, so it runs as a systemd service on the node rather than as a pod.
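As an illustrative aside to Dan's point about service proxying: where a kube-proxy-style plugin renders each Kubernetes Service into per-endpoint iptables rules on every node, OVN-Kubernetes expresses the same Service as a single OVN load balancer. Below is a minimal sketch of that idea, driving the standard ovn-nbctl CLI from Python; the VIP, endpoint addresses, switch name, and load-balancer name are all hypothetical, and this is not how OVN-Kubernetes is wired up internally.

```python
import subprocess

def nbctl(*args):
    """Run an ovn-nbctl command against the OVN northbound database."""
    subprocess.run(["ovn-nbctl", *args], check=True)

# Hypothetical Service: ClusterIP 172.30.0.10:80 backed by two pods.
vip = "172.30.0.10:80"
backends = "10.128.0.5:8080,10.128.1.7:8080"

# One logical load balancer replaces the chain of per-endpoint DNAT
# rules a kube-proxy-style plugin would add to iptables on every node.
nbctl("lb-add", "svc-demo", vip, backends, "tcp")

# Attach it to a logical switch; ovn-northd compiles the northbound
# record into logical flows, and ovn-controller on each node turns
# those into OVS OpenFlow rules.
nbctl("ls-lb-add", "node-switch-1", "svc-demo")
```

Updating the endpoints then means rewriting one northbound row rather than reloading a huge iptables ruleset, which is where the scaling win Dan mentions comes from.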
That way, OVS only gets restarted when the node itself gets restarted and there are no pods on it, so it doesn't interrupt traffic. The idea is the same with OVN: anything that is actually critical to routing traffic runs on the node itself. In the northd case, if the northd pods go down, that means we can't make changes to network connectivity until they come back up, but northd isn't actually needed to keep the packets flowing. So if northd goes down, the API server and the scheduler notice that it's gone down and start a new northd pod, and that just works, because the network will still be there. I guess there's some trickiness in that something has to tell northd about the new pod, but northd runs in HA mode, so as long as all three northd pods don't go down at the same time, you're fine. And if all three northd pods do go down at the same time, you probably have bigger problems, like a lightning strike or something.

Further questions, type them in. Thanks, Dan. I also wanted to point out that Till put a link to NMState in the chat.

Ah, okay. So, Prarit, tell us about RHEL 8.

I was just scanning down the people list, and I'll tell you a great story about this shirt. It's actually one of my favorite Red Hat shirts. Denise Dumas gave it to me, and I saw her name in the people list, so she gets a nice big shout-out. And I'll give her an extra one: I don't know if she's doing a session on diversity at DevConf.US, but — oh, I just noticed, she is — everybody on this call has to go and listen to what she says.

But look, RHEL 8 is one of our primary products here at Red Hat, and we are working hard on RHEL 9, so the shirt is almost out of date; I'll need to ask Denise to give me a new one. RHEL 8 was an interesting process, and RHEL 9 is going to be even more interesting. We're moving a lot of our development into more open areas. We're working right now with CentOS Stream, which people may have heard of as a thing now, and we're figuring out how we're going to work with CentOS Stream and what our development process will be with it. And that doesn't just go for the kernel: every package in RHEL is going to be more outward facing. We are supposed to be an open source company, and it's time we acted like one, is what I like saying to people these days. RHEL 8 is humming along wonderfully, and RHEL 9 is going to do the same, is what I'll say.

Yeah, thanks. And Denise's keynote is tomorrow at 9:30 a.m., so, like Prarit said, you should go and attend. It'll be quite good.

All right. Urvashi asks: what are some cool technologies that you all are working on in the systems and networking area — things that people who are not familiar with the area would like to know about? Let's make Waiman answer that.

Yeah, let me start with that. Okay. I'm a kernel engineer working mainly in the core kernel area, including locking, control groups, and memory-management-related stuff. Right now I'm trying to backport one of the upstream memory cgroup changes that will help us reduce the consumption of kernel memory. That will, I hope, allow OpenShift to fit more containers into a given amount of memory.
The way containers work is that they make use of two kernel features. One is cgroups, to partition the resources in a system; the other is namespaces, to isolate names, so that one container is not supposed to see or touch names from a different container or from the whole system. For the memory cgroup case, in order to control the amount of resources a container is allowed to use, you have to set up a limit on how much memory it may consume. That includes everything the applications within the container consume, like the page cache and file buffers, and also all the internal data structures the kernel maintains for the processes within the container.

As for the kernel memory the system uses: each container — specifically, each memory cgroup — has its own set of what we call kmem caches. For better performance, there are also caches within each kmem cache, including per-CPU caches as well as per-node caches, and it's those caches that can consume quite a bit of memory. So right now each memory cgroup has its own set of kmem caches, with no sharing between containers, and as a result you consume a lot more memory than you are actually using, because some of the memory is kept inside each container's kmem caches. A kmem cache allocates a set of pages for each of what we call slabs, and usually you are not going to use up everything within a slab, so the free space left over cannot be used by other containers.

With the upcoming kernel, we are going to backport a change that allows the kmem caches to be shared between containers. So instead of one cache per container, we can have one single cache shared by all. That reduces memory consumption by quite a bit, which hopefully reduces the amount of memory consumed by each container, so we can fit more containers on a system. That's it for me; I'll let the other panelists talk about what they are doing for the new releases.

I guess I'll go next. Right now I'm working a lot on dual-stack IPv4/IPv6 support in OpenShift. I think that's pretty much a solved problem out in the rest of the networking world, but Kubernetes for a long time hasn't been able to do dual stack. Microsoft has been working on that a lot upstream, to get dual-stack support for Kubernetes on Azure, and we've been working with them upstream, doing a lot of work in OVN-Kubernetes and in OpenShift to get all that working for the various customers that need it. But really, the big networking news in OpenShift these days is OVN, and I just talked about that, so I think that's about all I've got.

Okay, I can go next. I work on NetworkManager, as I said. NetworkManager is a rather old project already, but I really have the feeling that each release is better than the one before, and I'm very happy about that. I think that is also attributable to the fact that we significantly improved our continuous integration and testing over the past years. That is not a goal in itself, but it really helps our software work better and more reliably. What I think is the great part here is that there are all these other components that build on the NetworkManager API — the UIs, of course, but also, for example, Cockpit, and there is integration with Ansible.
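To make Waiman's memory-cgroup explanation concrete before the panel moves on: here is a minimal sketch of the kind of limit he described, written against the cgroup v2 filesystem directly. It assumes cgroup v2 is mounted at /sys/fs/cgroup and that you run as root; the cgroup name is hypothetical, and in practice a container runtime does this on your behalf.

```python
import os

# Hypothetical cgroup for a demo container; creating the directory
# under the cgroup v2 mount creates the cgroup itself.
cg = "/sys/fs/cgroup/demo-container"
os.makedirs(cg, exist_ok=True)

# Cap everything the container's processes consume -- anonymous
# memory, page cache, and kernel allocations such as slab objects.
with open(os.path.join(cg, "memory.max"), "w") as f:
    f.write(str(256 * 1024 * 1024))  # 256 MiB hard limit

# Move the current process into the cgroup; its future allocations
# (including kernel memory charged to it) now count toward the limit.
with open(os.path.join(cg, "cgroup.procs"), "w") as f:
    f.write(str(os.getpid()))

# memory.current reports usage including charged kernel memory; the
# shared kmem caches Waiman described shrink exactly this number.
with open(os.path.join(cg, "memory.current")) as f:
    print(f.read().strip())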
And we want to do more with OpenShift and these layered products. In our team we see NetworkManager as the part that provides the API for configuring the network, and I'm happy about that position; I think it's right that there is one component that provides it. It provides something on top of what the kernel does directly, because in the kernel you just configure the current network interfaces and IP addresses, and that's all just for the moment — you then need ways to persist it. For that, NetworkManager provides an API that is based on profiles. And I think it needs to be one central API, in the sense that different components use the same API, so that when you configure something, say with Cockpit or with Ansible, it touches the same things. So, yes, that's what I enjoy.

I think you're muted, Prarit.

I'll speak a little bit about what's going on in some of the platform areas. As we all know, Intel has been delayed at seven nanometers for a little while, so in the meantime they're putting some new features into existing processors. We're seeing things like CPU frequency enhancements — for example Speed Select Technology, where instead of just allowing an entire processor or an entire core to ramp its frequency up or down, we now have the ability to have user-selected cores rise to user-selected frequencies. And it's even more than that: you can modify the heuristic to work the way you want it to. We're seeing some areas of work in SmartNICs, getting RHEL onto the offload CPU and getting the RHEL OS involved there. NVMe and storage, things like persistent memory — I know we've tossed the NVMe phrase around for years, but it's now really becoming much more of a thing, one we're seeing even in lower-end servers. There's a lot of effort around edge computing, going out to the edge and making sure RHEL and Linux run securely in those areas, so there's a lot of security work going on there.

Also, I noticed this in the title — it's called Systems Engineering and Hardware. I almost wish I had a moment to yell at Aaron and say, hey, change it to Systems Engineering and Platform. We're no longer just doing hardware here; we haven't been for, gosh, ten years, I'd say. We are a virtualization company, and we deal with partner virtualization platforms — AWS, for example, runs RHEL on some of their systems. It's actually a big piece here at Red Hat that you can run in virtualized environments, containers, et cetera. So there's a lot going on in platform, and it's constantly changing.

Just so you don't stay mad at me: Systems Engineering and Hardware is actually the name of the track, not the session, so it's not my fault. Oh, is it? Yeah. Who do I have to yell at to get them to change that next year? I don't know — someone, maybe Urvashi.

I guess I can answer that question a little from the OVS side, since we talked a bit about it. One of the cool things in the more recent OVS releases is the support for interrupt mode on AF_XDP ports. So that's kind of cool: you can actually integrate with the cool buzzword technology that the kernel exposes. It's actually quite neat — the XDP interface, or API, or framework; I don't really know the best way of describing it. So that's kind of neat from the OVS side.

There's a question: how will SmartNICs with an embedded OS enable new platforms and workloads? This is something that we are keeping track of in OpenShift.
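Before the SmartNIC answer continues — for anyone curious what the AF_XDP support mentioned a moment ago looks like in practice, here is a minimal sketch of attaching a NIC to Open vSwitch as an AF_XDP port instead of going through the ordinary kernel datapath. The bridge and device names are hypothetical, and the exact tuning options (including how interrupt versus polling mode is selected) vary by OVS release.

```python
import subprocess

def vsctl(*args):
    """Drive ovs-vsctl to edit the Open vSwitch configuration DB."""
    subprocess.run(["ovs-vsctl", *args], check=True)

# Create a bridge whose datapath runs in userspace (netdev); this is
# what the AF_XDP implementation in OVS requires.
vsctl("add-br", "br0", "--",
      "set", "bridge", "br0", "datapath_type=netdev")

# Attach a physical NIC as an AF_XDP port: packets are delivered to
# OVS through an XDP socket rather than the regular kernel stack.
vsctl("add-port", "br0", "ens1f0", "--",
      "set", "interface", "ens1f0", "type=afxdp")
```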
So OVN-Kubernetes, the project to use OVN as a network plugin for Kubernetes and OpenShift, was mostly started by people at NVIDIA, who are doing something with giant clusters of GPUs or something, but they're really excited about SmartNICs and have been working on support for them in OVN-Kubernetes. I think Mellanox has cards that actually run OVS on the NIC entirely, so that you can offload all of your flow processing to the NIC rather than having to do any of it on the CPU. So they've been making sure that OVN supports that, which I'm pretty sure it does in the current code in git master. The way we deploy it in OpenShift right now won't work, but we have an enhancement proposal open to get that working for 4.7, I think.

Thomas, does NetworkManager have any special knobs to integrate with SmartNICs?

Not really. It just uses the kernel abstractions for the most part. It also doesn't configure things like flows directly; for that you would need OVS. NetworkManager can configure Open vSwitch, but it doesn't configure the flows itself, right? Yeah, I guess it just sets up bridges and ports; that's about it.

So the question was how this enables new platforms and workloads, and I don't know exactly, other than by being faster.

David asks: is there a strong relationship between OVS and Ansible, like there is with NetworkManager? Not that I'm aware of. I don't know that there's any Ansible module or plugin that is tightly coupled to OVS. There is a module, okay? Yeah — for us, the biggest thing configuring OVS right now is OVN, and a lot of projects speak to OVN. For instance, OpenStack and OpenShift program OVN, and then OVN goes and programs OVS. So I don't know what the Ansible projects are doing. Okay — NMState supports OVS and has some Ansible support. Okay, cool. Does that include programming flows? Sorry, now I'm asking you questions. Just what NetworkManager supports, okay. It's possible there are as many experts in the audience as there are on the panel.

All right, here's a question, since I don't see one posted yet: what's a difficult systems or networking problem that you ran into that you were never able to solve? I guess it's an interview question too, but why not. I want to hear Waiman's answer — everybody knows he's one of the best engineers at Red Hat, and if there's a problem that he can't solve, I would really like to hear what his answer is.

Well, I can't think of any at the moment. If you dig deep enough, you usually have at least some way to either work around a problem or partially address it. I would not say that you are able to solve every problem that you have, but you just try your best to make the best out of it — that is my philosophy. There's always something you can do. It may not completely address the issue, but at least it gives the customer some way to work around it or handle it in some way.

You know, Waiman, that's very profound. I think a lot of engineers get stuck in the trap of "I have to provide a solution," and at times there may not be a solution — a perfect solution, anyway. Getting something that just works, or at least works around a problem that a customer or partner may have, is very important, and being able to articulate that it isn't a perfect solution is also very important.
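Since NMState has come up a few times now, here is a minimal sketch of the declarative style the panel described, using the libnmstate Python bindings. The interface name and addresses are hypothetical, and this assumes the standard nmstate state schema; the same document can be written as YAML and applied with the nmstatectl CLI.

```python
import libnmstate  # Python bindings from the nmstate project

# Desired state, declared rather than scripted: one ethernet
# interface brought up with a static IPv4 address.  The names and
# addresses here are made up for illustration.
desired_state = {
    "interfaces": [
        {
            "name": "eth1",
            "type": "ethernet",
            "state": "up",
            "ipv4": {
                "enabled": True,
                "dhcp": False,
                "address": [
                    {"ip": "192.0.2.10", "prefix-length": 24},
                ],
            },
        },
    ],
}

# nmstate drives NetworkManager to realize the state, verifies the
# result, and rolls back if it cannot be reached.
libnmstate.apply(desired_state)

# The current state is reported in the same schema.
print(libnmstate.show())
```

That verify-and-rollback behavior is what makes the declarative API safer than running ad-hoc commands on cluster nodes, which is the OpenShift use case Dan mentioned.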
There are tons of cases at Red Hat that we deal with that are like that. In my case, a lot of it tends to be due to broken firmware. No customer or partner likes to hear that their firmware is broken, but I have to say that a lot of the time, and often the only real fix they can have is to update their firmware.

For me, the hard part is sometimes modeling a problem like networking configuration: you want to have an API to configure all of the network, but there is quite a lot of diversity in the technologies involved, and you then have to abstract that so that you can both implement it and say, wow, this is nice and simple — and so that the resulting API is actually useful and powerful at the same time. That, I think, is pretty hard.

Okay, this one is from Marcello, for Thomas: how are things going in regards to NetworkManager and networkd? I think they are going well. We don't cooperate that much, but, for example, NetworkManager consumes quite a lot of source code from systemd, to use the DHCP client from networkd. Currently systemd doesn't provide this as a proper library, so instead we kind of fork parts of it — and that is actually quite useful. We didn't only use it; we also contributed patches back to systemd for that part. Other than that, NetworkManager and networkd have a different focus and do different things, so both projects have their value, but they are not directly related. On RHEL, for example, we currently don't support systemd-networkd; it's excluded on RHEL 8 because we want to focus our effort on one solution, and it would be better to have one solution that works for a variety of scenarios instead of two solutions that both still have their own downsides. I don't know if that answers the question. We do rely on a lot of systemd components: we really like systemd-resolved, which will be the default on Fedora 33 and newer, and we also use systemd-hostnamed, and of course on RHEL NetworkManager runs as a systemd service. So in general we think systemd does great things, but there is a functional overlap with what networkd does.

All right, is there anything anyone's looking forward to — some cool new piece of technology coming up? This is more of an old piece of technology, but we will now be supporting SCTP in OpenShift, for all of you who were looking forward to that. I'm quite excited about NMState; it's a new project, and I think it shows a lot of promise. There is also a presentation by Fernando about NMState at DevConf; I just don't know the time right now. And even though this isn't purely a systems engineering issue, and I mentioned it before, I'm really looking forward to CentOS Stream. It's going to bring a lot of non-traditional partners to the table for Red Hat on the hardware side.

The question from Ali is: I have a modified Linux kernel that works great on QEMU, but not on hardware. What tips might you have for testing Linux on hardware and debugging it there? Ali, hopefully you can just type your answers. Are you getting far enough to get output from the kernel, or is it stuck in early boot? So, I guess there's output coming — it turns on, the application comes up. Okay. So when you boot it on hardware, one of the things you can do to find out what component is at fault — there are a couple of steps here.
While running the application on this modified kernel, it barfs. I see — so it's not a kernel issue.

Hi, can you hear me? Yep. I thought it was better if I just connected directly. So yeah, the application comes up, and while running it on real hardware it barfs; I can see a couple of CPUs are stuck. It gives me a problem after about 22 seconds, and I can see a few page faults. But what I'm actually looking for is this: on QEMU, I can debug it by connecting GDB to it from the outside, and I can take memory dumps and look at them through the crash utility. How do I do that on hardware? Do you have any tips for doing those things?

So there is kdump, which we have enabled in all the RHEL-related kernels — CentOS, Fedora, and RHEL. When the system panics, you should be able to collect a kdump, and there's a utility called crash which you can run on that dump. There's a lot of documentation about it on tons of websites; I think if you just search for how to debug the Fedora kernel with crash, you'll come across a couple of good hits. Yes, I've used crash before, through QEMU memory dumps.

Can I ask the nature of the changes you made? So this is regarding the Unikernel Linux project, which I'll be presenting today as well. It's basically the application, and glibc, linked in directly with the kernel, so everything runs in ring zero. When this application comes up, it starts running — it's a memcached application — and when we increase the load on the system, I see all these behaviors that I do not see on QEMU. The way I've been debugging it on QEMU is that I can see all the symbols — application symbols, glibc symbols, kernel symbols — and I can take memory dumps and look at them through the crash utility. But on hardware I'm just starting out, and there's no obvious place for me to start.

I see; I'm trying to think of what else I could give you. May I suggest something? The Linux kernel has a lot of debug capability. If you run on the hardware and look at what sort of debug messages it gives you, that may give you some hints about what the problem is. For instance, we ship two sets of kernels: a regular production kernel, built for performance, and a debug kernel that provides a lot more debugging-related messages, to let you figure out whether there's something wrong with the system or the hardware. Also, QEMU provides you only a set of very simplified emulated hardware. On real hardware you have a lot more devices and things that you won't hit in the QEMU environment, and that may be the reason you are having problems booting on real hardware — something QEMU didn't emulate, so you won't see it when you boot on QEMU. Great, I understand.

All right. Urvashi has let us know that there is a breakout room for anyone who wants to continue the discussion. Thank you to all the panelists and all the participants.

Yeah, thank you all so much for the session. I definitely learned a lot. As Aaron mentioned, we have breakout rooms under the Experts tab for each track, so if you want to continue conversations there, please feel free to go there — you can share audio and video. We have more sessions coming up in this track from 12:50 p.m. onwards, so we'll see you back here at that time. It's break time now. Thank you all.
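As a postscript to that debugging thread, here is a rough sketch of the kdump/crash workflow described above, scripted in Python purely for illustration. The package, service, and debuginfo names are the usual Fedora/RHEL ones but may vary by release, and the vmcore path is hypothetical.

```python
import os
import subprocess

def sh(*cmd):
    """Run a command, echoing it first so the steps are visible."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. One-time setup: install the capture and analysis tools and enable
#    the kdump service.  The kernel command line also needs a
#    crashkernel=... reservation for the capture kernel.
sh("dnf", "install", "-y", "kexec-tools", "crash", "kernel-debuginfo")
sh("systemctl", "enable", "--now", "kdump")

# 2. When the kernel panics, the capture kernel boots and writes a
#    vmcore, by default under /var/crash/<host>-<timestamp>/.

# 3. Analyze the dump with crash, pairing it with the vmlinux (with
#    debug symbols) that matches the crashed kernel.  Inside crash you
#    can run bt, log, ps, and inspect memory, much like gdb on QEMU.
release = os.uname().release
vmlinux = f"/usr/lib/debug/lib/modules/{release}/vmlinux"
vmcore = "/var/crash/example-host-2020-08-26/vmcore"  # hypothetical path
sh("crash", vmlinux, vmcore)
```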