I'm Miquel, welcome everyone. I work for Bootlin, which was formerly Free Electrons. I've been quite active recently in the NAND subsystem, which is why I'm going to show you today the new interface that was merged recently to handle NAND flash. I live in Toulouse, in France, so please excuse my French accent. I will do my best.
Actually, I wanted to show you the stack in Linux, but while I was writing the slides, I found it would be kind of boring to spend an hour on that. So I decided to add a bigger part about the physical aspects of NAND memories. Then I will explain how NAND flashes are wired to your SoC and how you can control them using this interface, exec_op.
Before starting: I'm not a NAND expert. I will probably simplify some aspects of the physical part, and I will focus on SLC NAND, which stands for single-level cell. Today there are other technologies, MLC or TLC, that can store more than one bit per memory cell. But to simplify things, I will stick to SLC today.
Before the technical aspects, just some commercial information. NANDs were once designed to replace hard disk drives. They are widespread in consumer electronics like USB sticks, SSDs, SD cards and so on, and they come in different flavors. The ones I just talked about are managed NANDs: you don't see that there is a NAND inside, you just see a way of storing data. I will speak about raw NANDs only, also called parallel NANDs.
So let's start with the explanation of how a NAND memory cell works. I'll start deep in the matter, with the silicon atom you can see on the screen. It has 14 electrons and also 14 protons. The electrons are spread across three orbits. The outer one is called the valence shell, which holds the four valence electrons. Each of these four valence electrons binds with a valence electron of another atom, so one silicon atom binds with four other silicon atoms. That makes the crystal.
The crystal is electrically neutral and you won't have any current in it; at least, it would be a perfect insulator at zero kelvin. But in our world, if for instance light strikes an electron on the valence shell, it will absorb a quantum of energy and jump to the upper orbit, the conduction band. There it will drift randomly in the matter, and if you apply a voltage across the matter, you get an electric current.
To make this state permanent, people invented doping. Doping is about adding impurities to the silicon crystal. If you add atoms other than silicon to a pure crystal, and these atoms have either one more or one fewer electron on the valence shell, you will either have one electron that is free or you will lack an electron in the bonds. It's N-doping when you have one more electron, because it's a negative charge that moves, and P-doping, for positive, when you have a hole. A hole means your atom is bound to only three other silicon atoms instead of four, and one of the neighboring atoms is left with an electron that wants to bind with another one but cannot.
If you put these two regions, N and P, side by side, you create a diode. Electrons close to the junction will jump into the P substrate and recombine with holes close to the junction. That makes this area of the P substrate no longer electrically neutral, and the same on the other side, which creates a small electric field that prevents other electrons from recombining with holes a bit further away. Now apply a voltage across the diode. If you apply a positive voltage on the N side, electrons will be attracted but you won't have any current.
However, if you apply a positive voltage on the P side, the electrons that were close to the junction will jump from hole to hole until they get out into the circuit, freeing the holes close to the junction and letting other electrons from the N substrate jump across that barrier, and there you have a current.
This is the basis of the MOSFET. In the center of the transistor you have a metal-oxide-semiconductor area: one leg which is conductive, then an insulator, which is the oxide, and then the P substrate, the semiconductor. If you apply a voltage across the outer legs, you won't have any current. But if you also apply a positive voltage on the gate, which is the leg in the middle, positive charges will accumulate on the gate and repel the positive charges in the bulk, leaving a thin channel that lets electrons from one N side go through to the other N side, across the P substrate.
So this is the basis of the transistor, but we'll agree that you cannot store data with that. That's why people added an extra floating gate. This floating gate is surrounded by an insulator, still the same oxide. If you do the same thing as before, I mean applying a voltage across the outer legs and also a positive voltage on the gate, you'll still have your thin channel of electrons moving from one N side to the other. However, if you have a lot of electrons in the floating gate, you have a large amount of negative charge that attracts the holes of the P substrate and creates a kind of big positive barrier that electrons cannot jump over anymore. This way you actually have a zero, because there is no current through the transistor anymore. So when there is no current, we call that a zero, and if I go back one slide, when a current is flowing through the transistor, it's a one.
So what you should be asking right now is: okay, but the floating gate is surrounded by an insulator, so how do you put charges into it? This is quantum mechanics. It's called the Fowler-Nordheim tunneling effect. When you apply a very high positive voltage on the gate, it attracts electrons and helps them tunnel through the oxide layer until they get into the floating gate. Please notice that the oxide on top is a bit thicker than the one at the bottom, so electrons can jump from the substrate into the floating gate, but not from the floating gate to the metal gate. That's how you program a cell to a zero state. The other way around, if you apply a high negative voltage on the gate, putting a lot of electrons on the gate will repel the electrons that are trapped in the floating gate back into the substrate, and that's how you erase the cell.
This is a much simpler view of the exact same transistor. You still have the floating gate in the middle. If I put two cells like that side by side, I should have an NPN region, then another NPN region on my substrate. But instead of doing that and adding a wire between two N regions, what hardware designers did is take a single piece of substrate and dope N regions at regular intervals, so that the layout is much smaller and you can have many more cells on the same silicon area.
Just a side note: when you have two cells side by side in series like this, if you want a logic zero on the right side, you have to apply a logic one on both gates so that this point is pulled to ground. If you put a one and a one, you get a zero. This is the NAND gate, right? That's why we call this kind of memory a NAND memory cell.
Two cells is good, but not good enough to store actual data. So what we do is create strings of NAND memory cells, in series of course. You can go up to maybe 64 cells in series. The only thing is that to cross a diode, you must apply about 0.6 or 0.7 volts across it.
So the more cells you put in, the higher the voltage you need to get current through all the transistors.
How do we read one cell, one bit, from this string? Just applying a positive voltage on the gate as we've done before is not enough, because if the other transistors in the same string are not passing, you won't have any current anyway. So what we do is apply a higher voltage on the other gates: not so high that it would trigger the tunneling effect, which is not what we want, but high enough that these transistors will be passing anyway, whether or not their floating gate is charged. That's how you can read one cell from a string.
If you put a lot of strings in parallel, and there you have absolutely no limitation, you get what we call a block. Those who are used to managing NAND flashes know the terms blocks and pages. A page is actually a row of cells connected by their gates. So when you want to select one bit, you actually select the whole page. That's why you can only program and read one page at a time.
I lied to you a bit before. Actually, to erase a cell you do not apply a high negative voltage on the gate, because that is kind of difficult to generate and we already need a high positive voltage. Instead, we apply the high positive voltage on the bulk. It has the same effect of attracting electrons back into the substrate. However, the bulk is shared across all the cells in one block. That's why, when you want to erase NAND cells, you have to do it block by block.
So, to sum up: you cannot program a cell to a one state, you can only program it to a zero state. If you want to erase a cell, you have to erase the whole block, so you have to erase the whole block before writing a page inside it. You can feel that this design is a bit fragile.
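The program and erase rules summed up above can be illustrated with a small toy model. This is only a sketch: the class name `NandBlock`, its methods, and the tiny page and block sizes are hypothetical choices made for the illustration, not anything taken from a real chip or from the Linux kernel. It shows that programming can only clear bits (1 to 0), that a page is programmed as a whole, and that only a full block erase brings bits back to 1.

```python
class NandBlock:
    """Toy model of one SLC NAND erase block (hypothetical, illustration only)."""

    def __init__(self, pages=4, page_size=8):
        self.page_size = page_size
        # An erased cell reads as 1, so an erased page is all 0xFF bytes.
        self.pages = [bytearray([0xFF] * page_size) for _ in range(pages)]

    def erase(self):
        """Erase acts on the whole block: every bit goes back to 1."""
        for page in self.pages:
            page[:] = bytes([0xFF] * self.page_size)

    def program_page(self, index, data):
        """Program one full page; bits can only move from 1 to 0."""
        if len(data) != self.page_size:
            raise ValueError("a page is programmed as a whole")
        page = self.pages[index]
        for i, byte in enumerate(data):
            page[i] &= byte  # a stored 0 cannot be turned back into a 1

    def read_page(self, index):
        """Read one full page."""
        return bytes(self.pages[index])


block = NandBlock()
block.program_page(0, bytes([0x0F] * 8))
block.program_page(0, bytes([0xF0] * 8))  # programming again without erasing
print(block.read_page(0).hex())           # the two patterns got ANDed together

block.erase()                              # erasing resets every page of the block
block.program_page(0, bytes([0xF0] * 8))
print(block.read_page(0).hex())            # now the page holds the intended data
```

Programming twice without erasing leaves the bitwise AND of both writes in the page, which is why the whole block has to be erased before a page inside it is rewritten.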
Depending on the voltage levels you choose, the Fowler-Nordheim effect will be strong or weak. There are several flaws in the design, and let me explain a few of them.
Bit flips: everybody knows what they are. You write a bit, you expect a value, and when you read it back you don't get the value you expected. That happens, for instance, if the cell was not fully erased or programmed, because some of the electrons that tunnel through the oxide don't make it to the floating gate and get trapped in the insulator. This creates a small negative area that repels other electrons trying to tunnel through the oxide and prevents the cell from being programmed.
There are also data retention issues. It means you write a page, you put the chip aside for a few months, maybe years, and when you take it back, you don't read the data you wrote. That's because, when tunneling, some electrons collided with the material and damaged it, which creates a kind of path between the floating gate and the P substrate. That's why, with time, some electrons get back into the substrate and you lose the charge you put in the floating gate.
And finally, obviously, read and write disturbances. You remember the string where you have to apply high voltages on all the other gates to read one cell. Tunneling is a stochastic effect, so you cannot be sure that no electrons will enter the floating gates you don't want to modify.
You can go through these program and erase cycles about 100,000 times for an SLC NAND. It's much less for MLC NAND, actually.
And that's all for the physical part. Now we know how a NAND cell works, so let's think about the NAND chip and how you can wire it in your design. For parallel NAND, you do this through a NAND controller.
The NAND controller is wired to your NAND chip through an I/O bus, which is either 8 or 16 bits wide, and there is a lot of logic around it. I will start with the lines at the top. The chip enable line is there to select one chip. The host, the NAND controller, selects a chip. Strictly speaking it selects a die, because you can see dies as logical NAND chips in one package, but for this example let's say there is only one die in our chip. The ready/busy pin works the other way around: it tells the host whether the NAND chip is ready, or whether it is still processing a command and needs more time. The write protect line is there to prevent any loss of data, so that the NAND chip won't accept any erase or program operation.
I will come back to the remaining lines. First I want to show you the NAND protocol. There are only three types of cycles that can happen on the I/O bus: command cycles, address cycles and data cycles. Data can obviously go both ways, while addresses and commands are always sent by the host, the NAND controller. That's how we use these lines: CLE stands for command latch enable, and means a command is being sent by the NAND controller. ALE stands for address latch enable. The last two lines, read enable and write enable, indicate who will talk on the bus, either the NAND controller or the NAND chip.
Putting NAND instructions together, you get a NAND operation that achieves a real goal, like reading something. Here are some examples. Let's start with the simplest one, the reset. If you want to reset the chip, it's just a matter of sending one command, which is 30h, no, it's FFh, sorry, and then waiting for the NAND chip to be ready again, and that's all. The read page is a bit more complex: you have to send the 00h command, one command cycle, then a few address cycles. The first command cycle tells the chip: okay, I want to read something.
I want to read a page, actually. Then you send the address cycles to tell it where you want to read. The 30h byte, the second command cycle, tells the chip: okay, now you can go into the actual NAND array and bring the data into your local cache. At this moment, the chip will assert the busy pin, so you have to wait for the operation to finish. Once it is finished, the host controller will toggle the read enable line so that the NAND chip sends the data over the bus to the NAND controller.
These controllers come in many flavors. Some are really simple, but they tend to be more and more sophisticated. Their main job is to talk to the NAND chip, but more and more of them embed additional logic, like an error correction code (ECC) engine to handle bit flips directly, and also some advanced logic to enhance the throughput.
So let's see now how this is handled in the Linux kernel. This is the MTD stack, the memory technology device stack. You have probably already heard about UBI and UBIFS; that's the file system level and I won't talk about it today. Your request goes through the MTD layer, which abstracts the type of flash: it can be NOR or NAND, SPI, raw, anything. If it is a parallel NAND, the request also goes through the NAND core, and the NAND core translates the instructions from the MTD layer into requests that the controller drivers can understand.
Let's see how it was done until recently. We used a lot of hooks to achieve that. Who has already been into the NAND subsystem here? Please raise your hand. Okay, only one person, two, three, okay. So these are the hooks that are usually implemented in the NAND core and in the NAND controller drivers. cmdfunc is the one in the NAND core layer.
It was supposed to handle all the command and address cycles, and it calls one hook from the controller driver, called cmd_ctrl, which sends one command or one address cycle each time, and that's all. Other hooks from the controller drivers, like waitfunc or dev_ready, were used to wait for the NAND chip to be ready. There are also various read and write hooks, byte, word or buf, to retrieve or write data.
But this approach has some limitations. NAND controllers have become more and more complex. Old ones could just send independent command, address or data cycles with no problem. New ones started to aggregate all of that, for instance to enhance the throughput. These hooks cannot express that kind of aggregated operation, but that alone is not a big issue: the NAND controller can still be driven by these hooks. But some controllers stopped implementing the possibility of sending only one command or one address cycle. So developers started to override cmdfunc from the controller driver, and now we have plenty of different implementations of that hook.
That's a bit annoying. First, when you have to reimplement something that was in the core, you have a lot of situations to handle, and people just supported their own use case. Second, because the logic changes from one driver to the next, it cannot be changed as easily as we would like. NAND vendors still add new operations, if we trust Boris Brezillon, who is the current NAND maintainer, but we cannot add support for these operations because it would be too much trouble handling all the different implementations of cmdfunc. And most importantly, something which is really, really dangerous: drivers started to predict what the next move of the core would be, because cmdfunc does not provide the I/O length. You cannot know from cmdfunc how much data you will have to read or write.
People started guessing what the length of the data transfer would be. It's a clear symptom that the NAND core no longer fit the needs. I'm pretty sure it did at the very beginning, when controllers were quite simple, but not anymore.
So that's why we introduced exec_op. This interface is in the NAND core and, like cmdfunc and the other hooks before, it aims to translate MTD requests into NAND operations. I truly believe that it will fit most of the NAND controllers available today. It was merged in the 4.16 kernel and the Marvell NAND controller driver is already converted. The FSMC one and some others are really close to being converted, or are on the roadmap.
So how does it work? From the controller driver's point of view, you receive an array of instructions that makes up the overall operation. That's the difference with before. Your driver has to split the operation into sub-operations if needed, if it cannot handle the whole thing at once, and if it cannot handle the operation at all, it returns an error. That wasn't done before either. So maybe in the near future the NAND core could take over and try another operation to do the same thing, since there are multiple ways to do the same thing with the NAND protocol.
For simple controllers, it's just a matter of looking at each instruction and executing them one by one, and that's all. But for more complex controllers, we introduced a parser. The parser is there to make the logic much simpler and to let the NAND core be clever, not the NAND drivers. The way to use it is to fill an array of supported patterns. Each pattern is an array of NAND instructions plus a callback.
This is a simple example of what it could look like. The first pattern can handle a command, up to five address cycles and up to 1 KB of data. The second one can handle a command cycle and/or a wait. And the third one can only handle data, up to 1,024 bytes.
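To make the idea of a pattern table more concrete, here is a rough Python model of it. It is not the kernel code: every name (`Instr`, `Elem`, `Pattern`, `PATTERNS`, `match`, `parse`) is hypothetical, pattern names stand in for the driver callbacks, and two things the talk does not specify are assumed: that the address element of the first pattern is mandatory while its data element is optional, and that the first matching pattern wins. The operation fed to it is the read page sequence described earlier (00h, address cycles, 30h, wait, data), with five address cycles and a 2048-byte page picked arbitrarily.

```python
from dataclasses import dataclass
from typing import List, Optional

# Instruction kinds (hypothetical names, not the kernel identifiers).
CMD, ADDR, DATA, WAIT = "cmd", "addr", "data", "wait"


@dataclass
class Instr:
    kind: str
    size: int = 1  # number of address cycles or data bytes


@dataclass
class Elem:
    kind: str
    optional: bool = False
    max_size: Optional[int] = None  # limit on address cycles or data bytes


@dataclass
class Pattern:
    name: str  # stands in for the driver callback
    elems: List[Elem]


# The three example patterns from the talk; optionality is an assumption.
PATTERNS = [
    Pattern("cmd_addr_data", [Elem(CMD), Elem(ADDR, max_size=5),
                              Elem(DATA, optional=True, max_size=1024)]),
    Pattern("cmd_wait", [Elem(CMD, optional=True), Elem(WAIT, optional=True)]),
    Pattern("data", [Elem(DATA, max_size=1024)]),
]


def match(pattern, instrs):
    """Return (chunk, remaining instructions) if the pattern fits the head."""
    chunk, pos, leftover = [], 0, None
    for elem in pattern.elems:
        instr = instrs[pos] if pos < len(instrs) else None
        if instr is None or instr.kind != elem.kind:
            if elem.optional:
                continue
            return None
        pos += 1
        if elem.max_size is not None and instr.size > elem.max_size:
            if elem.kind != DATA:
                return None  # too many address cycles for this pattern
            # Take what fits; the rest of the data goes to a later chunk.
            chunk.append(Instr(DATA, elem.max_size))
            leftover = Instr(DATA, instr.size - elem.max_size)
            break
        chunk.append(instr)
    if not chunk:
        return None
    rest = instrs[pos:]
    return chunk, ([leftover] + rest if leftover else rest)


def parse(patterns, instrs):
    """Split an operation into (pattern name, chunk) pairs; first match wins."""
    plan = []
    while instrs:
        for pattern in patterns:
            result = match(pattern, instrs)
            if result:
                chunk, instrs = result
                plan.append((pattern.name, chunk))
                break
        else:
            raise ValueError("operation not supported by this controller")
    return plan


read_page = [Instr(CMD), Instr(ADDR, 5), Instr(CMD), Instr(WAIT), Instr(DATA, 2048)]
for name, chunk in parse(PATTERNS, read_page):
    print(name, [(i.kind, i.size) for i in chunk])
```

With these assumptions the read page operation is split into four chunks: command plus address cycles, second command plus wait, and two data transfers of 1024 bytes each. An operation that no pattern can start with is rejected with an error, mirroring the driver returning an error when it cannot handle an operation at all.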
So if, for instance, you want to reset your chip with this kind of driver, you give the parser both the operation that was handed to you by the NAND core and the array of patterns you support. The parser goes through all the supported patterns, finds a match and executes the callback, and that's all. It makes the logic in the controller driver really, really simple.
In our case, the reset command can be handled directly by the second pattern. The second operation, the read ID, can be handled by the first pattern with no problem: even if there is only one address cycle, we don't care, since this pattern can handle up to five address cycles. But the last one is a bit trickier. There is no pattern that will handle the whole operation directly, so the parser will split it into three chunks. The first one will be just the command cycle and both address cycles, and the first pattern will handle it. Then the second callback will probably be used to send just the second command cycle. Finally, the third callback will be used to transfer the data, but it will be called twice, because in this example you can handle only 1 KB of data at a time. So this is how exec_op is supposed to work.
That's all for this interface, but I thought it would be interesting to mention at least two other hooks that you have to implement in your controller driver. exec_op is one of them. You also have setup_data_interface, which is there to change the timings on the controller side, because of course you can drive a NAND chip at different speeds, and it's important that both your NAND controller and your NAND chip run at the same speed. And the last one is select_chip, which is a way to select the NAND die, actually, not the entire chip. For simple controllers, it will just be a matter of driving the pin that selects the chip.
But it can also be the place to change the timings, if you have multiple chips in parallel that are not all using the same timings. And yeah, that's all for the MTD stack.
If some people want to help migrate these drivers, they are welcome. Just a few tips you can use. You should now use the user space utilities, not the ones in the kernel. There are modules that do the same things, but they are deprecated and might be removed quite soon. And do not hesitate to read the documentation, even if there is almost none, or it's really, really old. So I would suggest instead that you contact us on the mailing list, and please do not forget the NAND maintainer: it puts him in a bad mood when he has to read the whole list.
And that's all. If you are interested in the NAND framework, I suggest you have a look at the talk by Boris Brezillon in Berlin in 2016, and also at the talk by Arnout Vandecappelle, which is about the physics of NAND memories and goes into much more detail. It was the same year in Berlin. Thank you very much for your attention. If you have any questions, I'll be pleased to answer.
Yeah. Oh, what kind of security did we implement in the parser? We do all the checks in the NAND core, both in the parser and in the functions that call your functions from the NAND controller driver. So you shouldn't have to worry about these errors in the controller driver. Everything has already been checked beforehand, both in the parser and in the NAND read and other NAND functions that are in the core.
Any other questions? Okay. Well, thank you.