screen. The rest of the talk will not be like this. There will be some code slides on the screen and we will do some basic stuff. But before we look at that, who knows about DPDK, or has some information about it? Okay, thank you. And is there anybody already writing DPDK code? Okay, you guys know a lot, sorry. So I think it is worth making a little introduction about DPDK.
DPDK, the Data Plane Development Kit, is of course an open source project. It's an FD component, it supports NICs from multiple vendors, it runs on multiple architectures, and it mainly solves one problem. The problem is packet sizes: for different speeds, 10 gigabit, 40, 100, a certain number of packets is required to hit line rate. If you talk about small packets, 64 bytes, the number of packets needed to hit line rate is a lot. There's a little calculation below for 64 bytes on a 40G network: the number of packets is almost 60 million per second. So every 16 nanoseconds you have a new packet. If you are doing forwarding or anything else, you need to handle that packet in that time frame. And if you have a 2 GHz CPU, that means only 33 cycles. So that's too little.
And on the graph there are some notes. For example, if you are hitting the L3 cache, you are still in cache, but it is the last level cache, and it takes something like 40 cycles. And if the CPU misses the cache and goes to memory, it is around 70 nanoseconds. So if you have a cache miss, with your CPU going to memory and fetching the data, you have already lost a bunch of packets. So you cannot make line rate.
So that's what DPDK targets. With DPDK, you can have line rate performance. How it is done is based on some software concepts and on benefiting from the architecture. So it is not regular software; it is mostly lower level. I won't tell you all of them, but these two pages are basically what DPDK does. There is nothing new here, but it is all combined together to achieve this.
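
The per-packet budget from the slide can be checked with a few lines of arithmetic. The sketch below is plain C and is not taken from the talk; the function names are made up for illustration, and it assumes the standard 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter and inter-frame gap) on top of the 64-byte frame.

```c
/* Bytes added on the wire to every Ethernet frame (assumption of this
 * sketch): 7-byte preamble, 1-byte start-of-frame delimiter and a
 * 12-byte inter-frame gap. */
#define WIRE_OVERHEAD_BYTES 20.0

/* Packets per second needed to saturate a link of link_bps bits/s
 * with frames of frame_bytes bytes (frame size including the CRC). */
double packets_per_second(double link_bps, double frame_bytes)
{
    return link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8.0);
}

/* Time available to handle one packet, in nanoseconds. */
double ns_per_packet(double link_bps, double frame_bytes)
{
    return 1e9 / packets_per_second(link_bps, frame_bytes);
}

/* CPU cycles available per packet on one core running at cpu_hz. */
double cycles_per_packet(double link_bps, double frame_bytes, double cpu_hz)
{
    return cpu_hz / packets_per_second(link_bps, frame_bytes);
}
```

For a 40 Gbit/s link, 64-byte frames and a 2 GHz core, this gives roughly 59.5 million packets per second, about 16.8 ns and about 33.6 cycles per packet, which lines up with the "almost 60 million", "16 nanoseconds" and "33 cycles" quoted above.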
Right, I won't spend much time on this. Overall, DPDK is a framework, so it's a set of libraries, and these blocks show the different parts of the libraries.
So, what we will write. It will be a very simple application. DPDK is port based, managed in a per-port manner. So there will be two ports and a packet traffic generator, which we won't be able to see right now, only in the slides. We will forward between the two ports. That's all. By the end of this presentation at least, so perhaps in 20 minutes, with a hundred-ish lines of code, we will be able to forward packets at line rate. I think that will be amazing.
Before that, DPDK requires some environment setup. I won't spend time describing it, but very quickly: it uses hugepages, so hugepages need to be set up. Also, DPDK needs the network card to be managed not by kernel drivers but by UIO-based drivers. There is a tool provided by DPDK for this, dpdk-devbind, and it shows here two devices already driven by the UIO driver.
This is a basic makefile, and this is all we use as a makefile, because the DPDK make system supports external applications. We just include its files and set our target and source file. As you can see here, we need to set the RTE_SDK variable.
So these are just samples to show what we are doing. I will not talk about them now. We will start with this one, and go from this one to our forwarding application step by step. In this kind of screen, the left side is the code we are adding, in diff format, and the right side is what happens when we run that code. So we will make changes and we will see the output.
We will start with an API call. This is the most basic and perhaps the most important one, EAL initialization. EAL is the Environment Abstraction Layer, and it does all the heavy work. It sits on top of the hardware and initializes it. So what it does: it initializes memory, initializes cores, and initializes NICs. First it scans the system for network devices.
Then it probes them with the user-space Ethernet drivers. It gets the program arguments; you can pass them to EAL. In this simple application we just pass them through directly, but in a different application you may want to use some of them for the application itself and pass the others to EAL. When you run the application, it doesn't do anything useful at this stage. But as you can see, it found some devices. Now your application is aware of some PCI network devices, but right now we are not doing anything with them.
Next stage. Although all network interfaces are probed, we don't know anything about them yet in our application. So there is a set of APIs that helps us learn more about what we have. One of them, a simple one, is... is it easy to see this from there? Okay, that's a problem. It looks like the front side is better, not the back, because there is a complaint. Okay, I will try to explain this one. rte_eth_dev_count: it gives the number of Ethernet ports your application knows about right now. This simple application requires that there are at least two, and that the count is a multiple of two, because we will do forwarding between them. Again, when we run it, there is nothing special there. This is just to show that DPDK has some APIs to control the Ethernet devices. It also has a set of helper functions like this one. There are many, and if you are developing a DPDK application you may go and discover them and use them.
The next thing is also important and perhaps new: rte_pktmbuf_pool_create. To describe what it is, I need to describe what the mempool concept is in DPDK. DPDK manages the memory itself; it has to manage the memory to reach the numbers we have seen, because we don't want any TLB misses. So we cannot directly use the operating system's memory allocation APIs. That's why DPDK uses hugepages, to reduce TLB misses. And during EAL initialization, it does a few things.
First it finds contiguous memory. Then it creates memzones and memory segments. On top of this there is the mempool concept, which is in effect fixed-size pieces of memory managed by a ring. So instead of allocating and deallocating each time, we get a piece of memory from that ring and put it back, all of fixed size. And this is a specialized version of it for packet mbufs. An mbuf is like the Linux sk_buff or the FreeBSD mbuf: a data structure, with DPDK's own metadata, that points to a network packet. So when your network card receives something and stores it in memory, the mbuf is your reference to it. So here we are creating a set of mbufs. We need to do this, and we need to provide this information to the user-space driver to manage it. So we are also doing the memory management part here, but with one simple call we have a mempool of some size created.
Let's keep going. The next thing is managing the network cards. We have some handles for them, so we can access their drivers, but we need to do a few things. This is just preparation for port initialization; there is nothing here right now. Now we start doing it. The first thing we need to do, and this is a mandatory step, is configure. Or this may not be a mandatory step, thanks Thomas. To configure the Ethernet device, it is rte_eth_dev_configure. I will just switch to the browser here to show that the API documentation is already online. If you are doing something, you can find it there. For example, it takes the port ID, the number of Rx queues, the number of Tx queues, and a configuration struct which has this stuff. There are different Rx mode configurations here. In this sample it does nothing, but you can always go and refer to the API documentation.
This application uses only one queue for the NIC, but we may use multiple queues. Nowadays NICs support multiple queues. And are we familiar with why we would want to use multiple queues, for example?
Anybody who knows why we want to use multiple queues? Again, please? For performance, because in DPDK each queue is a memory location. The NIC puts its data into queues and we use a core to process that data. Again, we are very limited in time. So if we want to process more, what we can do is put the data into multiple memory locations and use different physical CPU cores to process them. So we parallelize. But how can we put network data into multiple queues? By configuring our network card to support multiple queues. Traffic can be directed to different queues in different ways, like RSS or the flow director methods, and you can use multiple cores to process them and get better performance. Right now this one uses a single queue.
Next, for each queue, we configure the Rx and Tx queues. For Rx, we also need to provide the mempool so that the driver can use it. The number of ring entries can also be configured here. The number of ring entries is, you can say, the size of the memory the NIC uses to put packets in. Also, most of the APIs in DPDK take a socket ID as an argument. If you are on a NUMA system, if you have multiple sockets, it matters which socket you allocate your memory from and which cores you use for processing. As we said in the beginning, just to stress it, DPDK is mostly about performance. That's why we go down to the level of saying which CPU to use for which queue.
We configured the Rx and Tx queues. The next thing is simple: start the port. So this is what we use for port initialization: configure it, configure Rx and Tx, and start the port. There is nothing new when we run the application, but you can see a new argument, -w. Since we pass all arguments to EAL, and EAL already supports a bunch of arguments, one of them is -w, which whitelists PCI devices. My system has many devices, but I don't want the DPDK application to use all of them. So that's the way of saying use only this one and this one.
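
As a side note on the multi-queue point above: the idea behind steering traffic to queues is that a hash of the flow's addresses and ports picks the queue, so all packets of one flow land on the same queue and the same core. The sketch below is plain C and only illustrates that idea. It uses FNV-1a as a stand-in hash, not the hash a NIC actually uses for RSS, and `struct flow_key`, `flow_hash` and `pick_queue` are hypothetical names, not DPDK APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* Fields that identify a flow (hypothetical layout for this sketch). */
struct flow_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
};

/* One FNV-1a (32-bit) update step over a run of bytes. */
static uint32_t fnv1a_update(uint32_t h, const void *data, size_t len)
{
    const uint8_t *p = data;

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;          /* FNV prime */
    }
    return h;
}

/* Software stand-in for the hash the NIC would compute in hardware. */
uint32_t flow_hash(const struct flow_key *k)
{
    uint32_t h = 2166136261u;    /* FNV offset basis */

    h = fnv1a_update(h, &k->src_ip, sizeof k->src_ip);
    h = fnv1a_update(h, &k->dst_ip, sizeof k->dst_ip);
    h = fnv1a_update(h, &k->src_port, sizeof k->src_port);
    h = fnv1a_update(h, &k->dst_port, sizeof k->dst_port);
    return h;
}

/* Map a flow to one of nb_queues receive queues (nb_queues must be > 0).
 * The same flow always maps to the same queue. */
unsigned pick_queue(const struct flow_key *k, unsigned nb_queues)
{
    return flow_hash(k) % nb_queues;
}
```

Because the mapping depends only on the flow fields, packets of one connection stay in order on one core, while different connections can spread across cores.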
So there's a whitelist parameter that's already supported there.
Next step: we enable promiscuous mode on the NIC. That's mainly to show that there is a set of APIs you can use to configure the network card. This is one sample; there are lots of them.
And next, we will configure the cores, because DPDK can also manage the cores, again for performance. This piece of code doesn't do anything special right now. There is the API rte_eal_mp_remote_launch, and this is SKIP_MASTER, so it will run this function on each core except the master. The master is the first core that DPDK knows about, and by default it is zero. If you don't give any parameter to DPDK, it will use all the cores in your system. If you are on a multi-core system, DPDK will use all of them, or you can set which cores to use. rte_eal_mp_wait_lcore waits for them to finish. In this step the code hard-codes which core to use for forwarding, which is core one. So if this is running on the forwarding core, it goes into an endless loop. The rest will just print and exit.
So we have another set of APIs to manage the cores. In this sample, for example, instead of letting the application use all cores, the --lcores parameter tells the DPDK application to use only these cores. In the example it is zero and two, and since neither of them is the forwarding core, the application just exits. And if you tell the application to use cores zero and one, core one will be in the endless loop, so we will have to use Ctrl-C to break the application. So this gives flexibility in how the cores are used.
Next, we will now start receiving packets. It is mainly done by rte_eth_rx_burst. DPDK uses burst receive and transmit because you cannot reach those numbers by receiving and transmitting packets one by one. So bursting is one of the methods mentioned earlier and used here. We get a burst of mbufs. For this sample we are not doing anything with them, so we free all of them.
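
The receive-and-free pattern above leans on the mempool idea described earlier: a fixed set of fixed-size buffers recycled through a ring. The sketch below models that idea in plain, single-threaded C. It is not DPDK's implementation; `struct buf_pool` and the `pool_*` functions are hypothetical names, and the sizes are arbitrary.

```c
#define POOL_SIZE 8     /* number of buffers; must be a power of two */
#define BUF_SIZE  2048  /* bytes per buffer */

struct buf_pool {
    unsigned char storage[POOL_SIZE][BUF_SIZE]; /* allocated once, up front */
    void *ring[POOL_SIZE];  /* free buffers waiting to be handed out */
    unsigned head;          /* free-running index: next slot to take from */
    unsigned tail;          /* free-running index: next slot to put into */
};

/* Put every buffer into the ring; called once at start-up. */
void pool_init(struct buf_pool *p)
{
    for (unsigned i = 0; i < POOL_SIZE; i++)
        p->ring[i] = p->storage[i];
    p->head = 0;
    p->tail = POOL_SIZE;
}

/* Number of free buffers currently in the ring. */
unsigned pool_available(const struct buf_pool *p)
{
    return p->tail - p->head;
}

/* Take up to n buffers in one go; returns how many were handed out. */
unsigned pool_get_burst(struct buf_pool *p, void **bufs, unsigned n)
{
    unsigned avail = pool_available(p);

    if (n > avail)
        n = avail;
    for (unsigned i = 0; i < n; i++)
        bufs[i] = p->ring[(p->head + i) % POOL_SIZE];
    p->head += n;
    return n;
}

/* Give n buffers back. The caller must only return buffers it was
 * handed by pool_get_burst, so the ring can never overflow. */
void pool_put_burst(struct buf_pool *p, void **bufs, unsigned n)
{
    for (unsigned i = 0; i < n; i++)
        p->ring[(p->tail + i) % POOL_SIZE] = bufs[i];
    p->tail += n;
}
```

Getting and putting buffers is just index arithmetic on the ring, with no call into the operating system's allocator. It also shows why every buffer has to be returned: once all POOL_SIZE buffers are out, `pool_get_burst` hands out nothing more.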
You have to free all of them. These mbufs come from the mempool we allocated, so it's a limited resource. If you don't free them, you will run out.
Finally, the functional part starts: we transmit the packets to the other port. There's basic logic here, and it is mainly provided by this one. So the mbufs we received, we transmit to the other port. This whole thing is, I think, a little more than 100 lines. It is not very functional, but it can do forwarding between two ports at 10 gig line rate at least.
Another small feature, just to show how it can be done: this application will swap the MAC addresses between received and sent packets. This is just preparation, and this is how it is done. The API is rte_pktmbuf_mtod, mbuf to data. It gets the data part of the mbuf, so this is the actual data that the network packet contains. We know it's an Ethernet frame, so we cast it directly to the Ethernet header and so on. This will cause performance to drop a little. Most probably it will still be line rate, but since you are reading the packet data and it is not in a cache line, it needs to be loaded from memory into the cache. So this will lower the performance.
These are just for adding a few features. We didn't check whether the link is up or down in the sample application. Here the rte_eth_link_get API helps us get this information. DPDK normally uses poll-mode drivers, so a core goes and continuously polls the queue to see whether there are any packets or not. It does not use interrupts, again to keep the performance numbers. But there is an option to use interrupts, and for link status it is possible to use link status interrupts too.
Adding a signal handler: there's nothing specific to DPDK here. We will use it just to print the stats. This is how we get the stats, rte_eth_stats_get. There are two sets of stats in DPDK. There are extended statistics, which a poll-mode driver provides: driver-specific, NIC-specific stats. And this is the generic one. The generic ones are supported by all drivers.
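
Going back to the MAC swap described above: once you have a pointer to the packet data, the swap itself is a few byte copies on the first 12 bytes of the frame. The sketch below is plain C with a hypothetical `struct eth_hdr` and `swap_mac`; in a DPDK application the pointer would come from rte_pktmbuf_mtod and the header type from DPDK's own headers.

```c
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Start of an Ethernet frame as it sits in the packet buffer
 * (hypothetical type for this sketch; byte arrays only, so the
 * struct has no padding and no alignment requirement). */
struct eth_hdr {
    uint8_t dst[MAC_LEN];
    uint8_t src[MAC_LEN];
    uint8_t ether_type[2];   /* big-endian on the wire */
};

/* Swap source and destination MAC addresses in place.
 * frame points at the first byte of the packet data. */
void swap_mac(void *frame)
{
    struct eth_hdr *eth = frame;
    uint8_t tmp[MAC_LEN];

    memcpy(tmp, eth->dst, MAC_LEN);
    memcpy(eth->dst, eth->src, MAC_LEN);
    memcpy(eth->src, tmp, MAC_LEN);
}
```

Note that this is the first point where the forwarding loop reads the packet contents at all, which is why it costs a cache line load per packet.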
And it will print very basically here. We can see the number of packets received, transmitted and dropped.
That's all I have. There are references. And in the DPDK repository there is a sample applications folder which goes from this level up to very high level, almost production-level applications. If you're interested in writing DPDK applications, that's a useful resource, and there are many applications there. On IRC there are always some people around if you want to ask. There are a bunch of applications already benefiting from DPDK. These are just the ones on the slide; I'm sure there are more. Thank you very much.
We have five more minutes if you have any questions. Almost five. Again, please? Are there libraries for any other languages available, is the question. I'm not sure. And it kind of has to be C because it's so low level in some parts. Okay, as a higher-level interface, no, not right now; it is only C.
How big is the library in big environments, the binary size? A few megabytes, that's not huge. Overall it's not a very big library. It's a reasonable one, I think. Okay. Thank you very much, everyone.