Welcome to this session on large virtual address (52-bit) support in the Arm64 kernel. We are going to talk about the relevant implementation details and discuss some pain points. Hi, I'm Bhupesh. I currently work with Linaro. I hope you are taking good care of yourselves during the pandemic.
Large address support is a very frequently discussed topic, but it can become boring quite quickly. So I have tried to make this talk condensed, but not so condensed that it becomes boring for you. I did a talk on the same topic at Linux Conf Australia a month back, also virtually, and this is an update on that talk. Basically, I gathered all the feedback that people shared there and incorporated a few changes accordingly. You can also find an article I have written on this topic for Opensource.com, which will give you some more details. In this talk I'll mainly cover large address support on Arm64 and try to demystify a few details. So let's open up the mystery box bit by bit.
First, a bit about myself. I currently work with Linaro as part of the landing team. This is my snap from the last in-person conference I attended in 2019, when you could still travel freely. Yeah, missing those days, right? I contribute to several upstream open source projects like Linux, EFI, U-Boot and bootloaders. Recently I've also started contributing to user space utilities like kexec-tools and makedumpfile. I also co-maintain the crash utility upstream, which is a user space tool too.
All right, so let's see what we are going to talk about today. First, we are going to talk about large virtual address support: the what and the hows, so the basics.
Then we are going to discuss the existing kernel memory layout on Arm64. Thereafter, we will discuss flipping the Arm64 kernel memory map to accommodate the increased memory range, and I'm going to talk about what happens with this flip in detail. Then we will talk about something very important: what happens to user space. What happens to the existing user space applications which expect pointers from, let's say, the existing 48-bit address range from the kernel? How do they keep on working? We will also discuss some of the user space applications that were broken after these kernel memory map changes. Thereafter, we will discuss how a user space application can explicitly request addresses from the 52-bit range. Later on, I'll talk a bit about how to test the LVA support, especially if you don't have real Arm64 hardware. And at the very end, I'll share some suggestions and next steps. Right, so let's begin.
Okay, so I hope all of you here have heard about 64-bit hardware and how it has moved very quickly from a good-to-have to a minimum required feature for various computing use cases. You can think of edge routers all the way up to servers, and supercomputers at the topmost end of the spectrum. So you can think of several such existing use cases. 64-bit hardware allows you, in theory, to connect up to 16 exabytes of memory, which is quite huge, right? We recently saw a server with 64 terabytes of memory connected to it. This was an x86 machine, but we are going to see more and more such servers coming up on other architectures as well, and similarly for Arm. So we have more and more use cases coming up that require addressing ranges larger than what is normally allowed by CPUs with 48-bit virtual addressing.
Do note that there are still some limitations. Not all processors support the full 64-bit virtual or physical address space. We'll talk about that more in the upcoming slides. Also note that I'll focus this talk mainly on the Arm64 architecture and mainly on the virtual address support requirements, just to save time. A similar discussion applies to the increased physical address space on Arm64, but I'll concentrate on the virtual address space. So let's see what happens further.
All right, so in the previous slides we mainly talked about the architectural limitations in the CPU design for addressing the complete 64-bit memory map. Let's see how two widely used architectures, champions in their own right, fare in terms of their virtual addressing capabilities. Intel x86-64 introduced five-level page table support, in both hardware and software, with its 10th generation Ice Lake cores. So cores like the i3, i5 and i7 support five-level page tables, which allow them to address up to a 57-bit virtual address space. Note that this bumps the possible virtual address space all the way up to 128 petabytes. The physical address space also gets a bump, to four petabytes. Quite a leap, right?
On similar lines, Arm64 introduced two new architecture extensions. These are called the 52-bit addressing extensions, namely ARMv8.2-LVA and ARMv8.2-LPA, where LVA stands for large virtual addressing and LPA stands for large physical addressing. These are part of the Armv8-A application profile architecture, version 8.2. It is expected that Arm64 cores with these extensions will allow you to address all the way up to four petabytes of virtual and physical address space.
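The sizes quoted above follow directly from the address widths. As a quick check, here is a minimal Python sketch that converts a number of address bits into a size in binary units; the helper names `address_space_size` and `human_readable` are made up for this illustration and are not from the talk.

```python
# Sizes of the address spaces mentioned in the talk, derived from bit widths.
# Helper names are hypothetical; binary units (1 PiB = 2**50 bytes) are used.
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def address_space_size(bits: int) -> int:
    """Number of bytes addressable with the given number of address bits."""
    return 1 << bits


def human_readable(size: int) -> str:
    """Format a power-of-two byte count using the largest fitting binary unit."""
    unit = 0
    while size >= 1024 and unit < len(UNITS) - 1:
        size //= 1024
        unit += 1
    return f"{size} {UNITS[unit]}"


if __name__ == "__main__":
    widths = [
        ("48-bit virtual addressing", 48),
        ("52-bit addressing (Arm64 LVA/LPA)", 52),
        ("57-bit virtual addressing (x86-64, five-level tables)", 57),
        ("full 64-bit", 64),
    ]
    for label, bits in widths:
        print(f"{label}: {human_readable(address_space_size(bits))}")
```

So moving from 48 to 52 bits multiplies the addressable range by 16, from 256 TiB to 4 PiB.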
That is quite a big jump compared with the earlier maximum of 256 terabytes that was achievable with 48-bit addressing. So we saw that both x86 and Arm64 have increased the virtual and physical address support that is available, and you can use these new features to develop applications that need to address a larger range.
Okay, so just a quick recap of the previous slide from the Arm64 perspective. We got to know that the Armv8.2 application processors from the Cortex-A family, such as the A55, A75 and A76, provide two extensions, called the LVA and LPA extensions respectively. CPUs with these extensions give you the capability to address 52-bit address ranges, which is quite a big jump from the earlier 48-bit address space. This makes these Arm cores possible candidates for high-end applications like servers or even supercomputers, use cases which also require a low power profile. That is basically a USP for Arm64 cores. So if you want to address larger ranges and also consume less power doing that, using these Arm cores probably makes the most sense.
Then let's talk a bit about address spaces. In the previous slide we talked about 48-bit and 52-bit address spaces. But what happens in the background when a CPU tries to read from or write to an address? Let's see. I'm sure folks here already know a lot about MMU hardware, so I'm not going to take a deep dive into the role of the MMU and how it separates the virtual and physical address spaces. But let's look at a simple example to understand how the hardware model works when a CPU issues a virtual address and needs to find the equivalent physical address.
Here I'm taking an example that is specifically suited to the Arm64 architecture, but the basic underlying principle is valid for other architectures too. When an Arm64 processor issues a virtual address for an instruction fetch or a data access, the MMU hardware translates this virtual address to the corresponding physical address. This happens using a process called the translation table walk. Usually a translation table walk comprises one or more translation table lookups. In this example, I'm going to show three translation table levels. Level one is the topmost level and level three is the lowest. In each translation table lookup you get a descriptor in return, which indicates one of the following. Either the entry is the final entry of the walk, in which case it contains the physical address and the associated permissions and attributes for this access. Or an additional level of lookup is required, in which case the entry contains the translation table base address for the next lookup. You can see this in the figure on the slide: there are three levels of page tables, level one is the highest and level three is the lowest, and each descriptor points either to a physical address or to the base address of the next-level lookup table.
Now, in the next slide, let's look at what happens from a software perspective, particularly how the latest kernel supports these translation tables. On this slide, we'll see how things happen from a Linux point of view. The Arm64 architecture currently supports page sizes of 4K, 16K and 64K. Of these, the normally used page sizes are 4K and 64K. 4K is mainly used for the embedded profiles, whereas 64K is used for the server profiles. Let's look at the virtual addressing ranges supported when the 4K page size is used.
First is the 39-bit addressing range, which uses up to three translation table levels. The second is the 48-bit addressing range, which uses up to four translation table levels. You can refer to the Arm64 memory documentation page inside the kernel documentation for more details.
Continuing with the Linux details: for the 64K page size, which is normally used for the server profile, let's look at the virtual addressing ranges supported. First is the 42-bit addressing range, which uses two translation table levels. The second is the 52-bit addressing range, which uses three translation table levels. Note that with 64K pages, the number of translation table levels for 52-bit addressing is intentionally kept the same as for 48-bit addressing, to minimize the effort required for moving from 48 bits all the way up to a 52-bit address space. Just the number of descriptors in the first translation level is expanded for 52-bit addressing. Again, you can refer to the Arm64 memory documentation page in the kernel documentation for more details.
In my view, the best way to understand the 52-bit address space on Arm64 is to look at an example. We saw in the previous slide that 52-bit support is only available with a page size of 64K and that it requires three levels of page tables. So let's see how the three levels of kernel page tables fit into the picture here. The topmost one is the page global directory, the PGD. The next one is called the page middle directory, the PMD. And the last one, the PTE, holds the page table entries. Let's start from our earlier premise: the core issues a 64-bit virtual address and we want to find the equivalent physical address on Arm64. On the extreme left of the picture you can see the TTBR select bit, which is bit number 63. It selects between the kernel space and user space addresses.
TTBRx holds the base address of the level one page table. Bits 51 down to 42 of the incoming virtual address give us the PGD index in the level one table. Similarly, bits 41 down to 29 give us the PMD index in the level two table. And lastly, bits 28 down to 16 give us the PTE index in the level three table. Finally, the derived PTE value and the lowest 16 bits of the virtual address are combined to determine the final physical address. As you can see in the figure, we have three page tables, level one, level two and level three, and certain bands of bits of the incoming virtual address are used to index into these tables. I hope this example makes things clearer. Again, feel free to refer to the Armv8 documentation from Arm, mainly the part on address translation, for more details.
Right, so we come to an interesting aspect here. How do we decide on a kernel design approach that supports both the older Arm64 CPUs, which don't support the 52-bit extensions, and the newer ones, which do? In the kernel, we have selected an approach that keeps a single binary and checks at early boot time whether the underlying hardware supports the 52-bit extensions or not. If it does, the kernel uses the 52-bit virtual addressing mode. Otherwise, it falls back to the default 48-bit, or low, virtual addressing mode. Let's take an example of two platforms: the Ampere eMAG Arm64 workstation, which doesn't have support for the new extensions, and the new Fujitsu FX700, which claims support for the v8.2 extensions. As the figure here demonstrates, the decision about which addressing range to support is made at early boot time, and accordingly either the 48-bit or the 52-bit addressing range is used.
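To make the index arithmetic from the walk-through above concrete, here is a minimal Python sketch, not kernel code, that splits a virtual address into the PGD, PMD and PTE indices and the page offset for the 64K page, 52-bit layout described in the talk. It also derives the number of table levels for the page size and address width combinations mentioned earlier. The function names are made up for this illustration; the only assumption beyond the talk is that each descriptor is 8 bytes, so a table that fills one page resolves `page_shift - 3` bits.

```python
def levels_needed(va_bits: int, page_shift: int) -> int:
    """Table levels needed to translate va_bits with pages of 2**page_shift bytes.

    Assumes 8-byte descriptors, so one full table resolves (page_shift - 3) bits.
    """
    bits_per_level = page_shift - 3
    bits_to_resolve = va_bits - page_shift
    return -(-bits_to_resolve // bits_per_level)  # ceiling division


def split_va_52bit_64k(va: int) -> dict:
    """Split a virtual address using the 64K page, 52-bit layout from the talk."""
    return {
        "pgd_index": (va >> 42) & 0x3FF,    # bits 51:42, level one (1024 entries)
        "pmd_index": (va >> 29) & 0x1FFF,   # bits 41:29, level two (8192 entries)
        "pte_index": (va >> 16) & 0x1FFF,   # bits 28:16, level three (8192 entries)
        "page_offset": va & 0xFFFF,         # bits 15:0, offset inside the 64K page
    }


if __name__ == "__main__":
    for va_bits, page_shift in [(39, 12), (48, 12), (42, 16), (52, 16)]:
        size_kib = (1 << page_shift) // 1024
        print(f"{va_bits}-bit VA, {size_kib}K pages: "
              f"{levels_needed(va_bits, page_shift)} levels")

    # A made-up address built from known indices, to show the round trip.
    example = (5 << 42) | (7 << 29) | (9 << 16) | 0x1234
    print(hex(example), split_va_52bit_64k(example))
```

With 64K pages a 48-bit address would use only bits 47:42 for the level one index, that is 64 entries, while the 52-bit layout uses bits 51:42, that is 1024 entries. This is the "expanded first level" mentioned earlier: the depth of the walk stays at three levels.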
Again, you can refer to the Arm64 memory documentation page inside the kernel documentation for more details.
Fair enough. On this slide, we'll quickly look at the variables which the kernel code uses, mainly to make sure that it handles both the 48-bit and the 52-bit virtual address widths, and that it is able to make an early boot time decision to switch from the 48-bit to the 52-bit virtual address space if required. It mainly uses three variables. There are others as well, but they are not very useful for this discussion. So let's discuss these three first: VA_BITS, VA_BITS_MIN and vabits_actual. VA_BITS denotes the maximum size of the virtual address space, VA_BITS_MIN denotes the minimum size of the virtual address space, and vabits_actual denotes the actual size of the VA space. So if you really want to check the VA space range supported by your running Arm64 kernel, you should look at the value reported by the variable vabits_actual.
So let's move further. Do we need to keep something additional in mind when we talk about the increased memory map support on Arm64? Surely. We need to keep in mind that the kernel memory layout was flipped starting from kernel version 5.4 to allow support for a larger virtual address space. Let's look at the figures on the slide to understand this better. By default, with up to 48-bit virtual address support, the Arm64 kernel map looked something like what is shown on the left side. The direct linear range was mapped at the farthest end of the memory range, whereas the kernel text range was kept near the lower address ranges. For the 52-bit support, we flipped the kernel memory map, but we decided to keep the kernel text addresses the same as earlier. So the direct linear map range now starts at the bottom of the kernel range, at the address with the leading Fs followed by all zeros.
After a gap for the kernel address sanitizer (KASAN) region come the kernel text and the other address ranges, which are kept as before. This flipped kernel map actually caused some interesting problems in user space. Let's look at them in the next slide.
Right then. So what happens in user space? Unfortunately, due to the flip in the kernel memory map, a few user space applications get broken, especially the ones that are used to debug a live kernel or to analyze vmcore dumps.
(Moderator: I just wanted to let you know that there are five minutes left. So far we don't have questions.)
Okay, sure. So unfortunately, because of this flip, we actually have some user space applications broken. The main reason is that these applications also need to perform a virtual address to physical address conversion. Applications like kexec-tools, the crash utility and makedumpfile were broken by the changes starting from kernel version 5.4. I have proposed some fixes for these utilities. Some have been accepted, while others are still pending discussion. You can look at the details using the hyperlinks on the slides, just to understand what happens.
The million dollar question coming up in your mind right now is what happens to the other applications, the ones which are not used for debugging the kernel per se. Do these existing applications break, because they are expecting a 48-bit address from the kernel? The answer is no, because we decided to keep an opt-in model for user space applications. That is, the kernel will by default return addresses from the 48-bit range. And if a user space application really wants a pointer from the 52-bit range, it can explicitly pass a hint, let's say in the mmap call, to request an address in that range.
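The opt-in model just described can be sketched from user space. The example below is a minimal, hedged illustration in Python that calls the C library's `mmap` through `ctypes`, first without a hint and then with a hint above the 48-bit range. The all-ones hint mirrors the `mmap(~0UL, ...)` example in the kernel's Arm64 memory documentation. The helper names `map_with_hint` and `is_above_48bit` are made up for this sketch, and it assumes a 64-bit Linux system. On hardware or kernels without 52-bit support, the hint is simply not honoured and an address from the default range comes back.

```python
import ctypes
import mmap
import os

# Assumption: 64-bit Linux, where off_t is a C long and libc exports mmap/munmap.
libc = ctypes.CDLL(None, use_errno=True)
libc.mmap.restype = ctypes.c_void_p
libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                      ctypes.c_int, ctypes.c_int, ctypes.c_long]
libc.munmap.restype = ctypes.c_int
libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

MAP_FAILED = ctypes.c_void_p(-1).value   # (void *)-1 as an unsigned integer
HIGH_HINT = (1 << 64) - 1                # same bit pattern as ~0UL in C


def map_with_hint(hint, length=mmap.PAGESIZE):
    """Map anonymous memory, passing `hint` as the address hint (None = no hint)."""
    addr = libc.mmap(hint, length,
                     mmap.PROT_READ | mmap.PROT_WRITE,
                     mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                     -1, 0)
    if addr is None or addr == MAP_FAILED:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return addr


def is_above_48bit(addr: int) -> bool:
    """True if the address lies outside the default 48-bit user range."""
    return addr >= (1 << 48)


if __name__ == "__main__":
    for label, hint in [("no hint", None), ("high hint", HIGH_HINT)]:
        addr = map_with_hint(hint)
        print(f"{label}: {hex(addr)} above 48-bit: {is_above_48bit(addr)}")
        libc.munmap(addr, mmap.PAGESIZE)
```

Because the hint is only a request and `MAP_FIXED` is not used, existing applications that never pass such a hint keep receiving 48-bit addresses.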
So the obvious question that would pop up in your mind now is: how can I test the 52-bit virtual address support, especially if I don't have any real Arm64 hardware? You need not worry. Simulation is actually a friend here. The first option is using QEMU. Here I have used an example of running a QEMU Arm64 guest on my Fedora x86_64 host. You can use the virt-builder tool, and here I'm creating a Fedora 30 Arm64 guest. Then you can run the same using QEMU. The second option is to use the Armv8 Fast simulator model, which can be downloaded freely. You can get more information about these models from the Arm website, and I have added a link for that in my slides. You can exercise Linux Debian or Fedora images on it. You can see a screenshot of the simulator model running a Fedora 30 Arm64 image on the slide below.
Right then. So what are some of the pain points and the next steps? The first step is to fix the broken debugging-related user space applications, which is a work in progress. I hope these will be fixed very soon. Second, make yourself and others aware of the flipped kernel address map after kernel 5.4. This might cause some issues, so just be aware of that. Also, newer or willing application owners can give 52-bit addressing a try. They can, for example, pass a hint in the mmap call, just to see that they can get an address in the 52-bit range from the kernel. Next, expect more changes around this feature in the near future, because it is still getting stabilized. For example, a patch set has already been posted to extend the direct linear map range for the 52-bit configurations. Towards the end, I would just say: test the upstream kernel if possible, and report issues upstream if there are any.
(Moderator: So we have a question from David.)
(Moderator: Is the pending discussion around tools like kexec related directly to the memory mapping, or to other related issues?)
So yeah, could you repeat that?
(Moderator: Sure. Is the other pending discussion, for tools like kexec, related directly to the memory mapping or to other related issues?)
So mainly it's the memory mapping. There were some user space applications that were broken, and I think these should be fixed pretty soon. Also, the 52-bit hardware is just coming up. We had only just started testing, and the main vehicle available was a simulator model. Now we are seeing a few hardware platforms coming up that support these extensions. So the discussions are mainly centered around what's broken and what can be addressed with the new flipped kernel memory map. But yeah, I expect that there will be more discussions and more changes. So just keep an eye on the arm64 memory.txt. I think it will change a bit further in the coming days.