So I guess I might as well get started. This room is pretty full. I'm just going to steal a minute from you that you could have spent sitting around. So I'm going to talk about a few things. First of all, what the heck is an attribute? Then, how were attributes stored two years ago, and how has that changed in the last two years? Then I'm going to talk a little bit about the future, and a little bit about some other performance tricks that we're dealing with in general. There's going to be a section in the middle that's pretty technical, with some C code, but I hope I can keep it visual enough that everyone can stay engaged.

OK, so what is an attribute? It's sort of a terrible term, because "attribute" is about the most vague word you could use in English. But really, in Blender, we're talking about a bunch of data attached to a geometry. These colors are attached to each vertex in the mesh. And by painting in sculpt mode, I ruined this nice demo scene with my own art, especially for this Blender Conference. So colors are attributes. Well, positions are also attributes: where are the vertices in this mesh? Actually, whether they're selected or not is an attribute too. So you can imagine the position is the x, y, and z together, and that's one attribute. And the selection is whether something is selected or not, a Boolean, true or false, stored in this case for every single face.

It can be a little more cryptic. There's a subtle difference between half of this cylinder and the other half, and that's whether it's shaded flat. You can see the normals on the left side are smooth. That's the same sort of thing as a selection, except it's stored more permanently. And that's just an attribute.

The best way we have in Blender of seeing attributes is the spreadsheet. It's new-ish, part of the geometry nodes project. On the very left side here, you see the sharpness attribute that we just saw on the previous slide. And we're looking at the faces right now, so you see the material index and then some random user-created data. But they're stored in the same place. So this hard-coded data inside of Blender is the same concept as something that's user-created; we just sometimes don't show it. For example, here we don't show the selection, because it's meant to be less permanent.

You can also create attributes in the properties editor. They each have a name, and then the other two words are the domain, so whether they're stored on vertices or edges, and the type. Those can get pretty advanced, like quaternions, or it can just be true or false, or 2D vectors. And that will possibly get even more complex in the future, with matrices and, well, let your imagination go.

But attributes, yes, they're properties on meshes. Really they're on meshes and curves and point clouds. And they end up being the basic data used everywhere by Blender. So whether you're animating, or making a procedural geometry node setup to create molecules, or rendering in Cycles, the vast majority of the data we're interacting with is attributes. So it's really important how we store them. And I'd say it's the most important decision you make when you're developing a program like Blender: where is the data, how is it accessed, and how does it actually live on the computer?

And sometimes you interact with attributes through geometry nodes. I want to bring this up especially because it illustrates the way the user sees things: on the bottom right, you see Auto Smooth.
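(For the programmers: here's a minimal conceptual sketch of what an attribute amounts to. The names here are hypothetical, chosen for illustration, not Blender's actual API.)

```c
/* A conceptual sketch only: hypothetical names, not Blender's real API. */
typedef enum AttrDomain { DOMAIN_POINT, DOMAIN_EDGE, DOMAIN_FACE, DOMAIN_CORNER } AttrDomain;
typedef enum AttrType { TYPE_BOOL, TYPE_INT, TYPE_FLOAT, TYPE_FLOAT2, TYPE_FLOAT3, TYPE_QUATERNION } AttrType;

typedef struct Attribute {
  char name[64];     /* "position", "sharp_face", or something user-created */
  AttrDomain domain; /* which element of the geometry each value attaches to */
  AttrType type;     /* what kind of value is stored per element */
  void *data;        /* one contiguous array: one value per element */
} Attribute;
```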
So that's a concept that, for the user, doesn't need to be related to attributes at all. When I want something to be smooth in some areas, I'm not saying I want this edge to be sharp and this edge to be smooth; I just want a higher-level idea. And there are various abstraction levels you can work at in Blender, walking down the ladder all the way until you get to the memory. Ideally, Blender can do that for you. So users maybe don't need to think about attributes. But developers certainly do. And that's what I started thinking about two years ago, and others were thinking about it way before that.

So this idea, where built-in types and user-defined types are in the same system, and you can attach data to multiple areas of the geometry: it didn't always used to be like that. The data types corresponded more to what you wanted to do with them, or where they were stored, or which list they showed up in in the UI. So it was sort of mixed up, and we didn't have this generic idea that, in my opinion, really simplifies the way things could be. And they were stored very differently, in ways that weren't great for performance.

So I saw that and proposed something I called the "mesh struct of arrays refactor", which is not a great name for a project that I had no idea how big it would be. But the point is that when you do low-level programming, all the data you store is in things called structs. And attributes are a bunch of structs. So you can either store an array of structs, which is when you put all of the data for one vertex in one chunk and then store that chunk many times, or you can store each attribute as its own array: a struct of arrays.

So we're going to get a little technical for a bit and talk about how we actually store this data in memory. And we're going to start with vertices. If we take two basic things that we would want per vertex, the most obvious is just the positions: where is the vertex? But then you might have some more advanced concept, like what is the up direction relative to the surface, or relative to the faces around it.

So we go to the memory. And here are two... well, one struct, and then a bunch of bits inside the struct. This is what vertices looked like in Blender two years ago. We had this struct called MVert. M stands for mesh, Vert stands for vertex, obviously. There are the coordinates, so the position. Then the normal, which is that up vector we just saw. And then two more esoteric things. One is the flag, which any C programmer will see and know what it is. Those values on the right were packed into that flag: we used each of the flag's eight bits to store a different piece of data. And then, because bevels were added to Blender at some point, we needed to store the bevel weight per vertex. Where else to put that but in the MVert struct?
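(For reference, the struct looked roughly like this. A simplified sketch, not the exact declaration from the old headers.)

```c
/* Roughly the old MVert, array-of-structs style (simplified). */
typedef struct MVert {
  float co[3];  /* position: x, y, z */
  short no[3];  /* normal, packed into 16-bit integers */
  char flag;    /* 8 bits: selection, hiding, the temporary tag, ... */
  char bweight; /* bevel weight, stored whether the mesh uses bevels or not */
} MVert;

/* The mesh stored one of these per vertex, all interleaved:
 * verts[i].co, verts[i].flag, verts[i].bweight, ... */
```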
So that's a lot of data in there, and a lot to cover, but I want to focus on one thing: this specific flag called ME_VERT_TMP_TAG. "Tmp" stands for temporary. It's used in a few algorithms in Blender to just store some data that we need, and who knows when it's used. The problem I saw with this is that it's a global piece of data in Blender. Everywhere in Blender that knows about a vertex knows about this temp tag. But it was only used in six or so places. So we basically make everywhere in Blender more complicated just because we need to store one true-or-false value per vertex in this one algorithm.

The way we can refactor this is to store each piece of data in its own array, in one chunk. On the left you can see all the positions are together, and then the normals. I should have explained what the top was; if we go back, I sort of skipped over that. OK, so you see the blue stuff: that corresponds to the positions. Then you have the normals. And there are four of each, so we've got four vertices there. And if we want to know the bevel weight (I don't know if you can tell the difference between the white and the yellow), you have to walk over all the blue stuff and all the orange stuff, and then you finally find the bevel weight. Then, if you want the next bevel weight, you walk over all that memory again until you find the next yellow spot. Computers can deal with a lot of data very quickly, but here you're dealing with maybe 20 times more data than you have to. And in Blender, we're dealing with millions and millions of vertices, and we need that to be fast. So that way of structuring data just doesn't work. Well, we made it work for a long time. But now it looks more like this.

The point of putting these data structures here is just to show that these bigger bars of separated data map to actual data structures, in the same way the struct we saw at the beginning did. Focus on the positions, because probably 90% of algorithms in Blender just care about the positions, and users mostly just care about the positions. So we spend most of our time dealing with that data, and packing it together with all this less-used data is not necessarily the fastest way to do things. The arrays are also generic: other software stores positions in just this exact same way. And now they're not special to meshes anymore. Point cloud positions are stored in exactly the same way, which means the algorithms don't have to be specialized for meshes anymore.

There's one other differentiation, with normals. If you think about that up vector, it's not something you'd necessarily control yourself. It's really just based on the faces, so we don't need to store it. And a lot of the time when you're using a mesh, you just want to move it around; you don't care about the up direction. So we separate that out into a separate cache.

To get very conceptual for a moment, just ask: what is a vertex? Is it the actual place in memory where that data lives, or is it just an index? If you changed all of the attributes in a mesh, is the 27th vertex still the same vertex? It's a bit of a ship of Theseus question, but I'd say yes. A vertex is not defined by the physical ones and zeros in one specific memory location; it's really more of a conceptual idea. And that's a big change compared to what we had before.
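(As a rough sketch in code, the struct-of-arrays version of the vertex data looks something like this. Field names are approximate, not the exact Blender declarations.)

```c
/* Struct-of-arrays sketch for vertices: each attribute is one array. */
typedef struct float3 { float x, y, z; } float3;

typedef struct VertexDataSoA {
  int verts_num;
  float3 *positions;     /* always there: what ~90% of algorithms want */
  float *bevel_weights;  /* optional attribute: NULL unless the mesh uses it */
  float3 *normals_cache; /* not saved at all: recomputed from faces on demand */
} VertexDataSoA;
```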
OK, moving on quickly to edges, you see the same sort of thing. An edge is really just a connection between two vertices. And if a vertex is just an index, that's what those two v1 and v2 entries are for. Then we had the crease and the bevel weight, and the same sort of flag. And the flag is huge. But we can simplify this. I'd say for the most basic meshes, we only need those first two things, because an edge is really just a connection between two vertices. All the extra data is only necessary in some situations.

So we do the same change, and it looks more like this. There's the first vertex, the second vertex, and then we continue storing those into the future. And all the other data is just in separate arrays at the end. So for most meshes, we only need the top one, and that's two thirds of the memory. And there are a lot of edges in meshes, millions often, so that makes a difference. Then, to go into one attribute: let's say the edge crease, which influences how a mesh is subdivided. Really, there's nothing especially edge-related about this data. It's just data. We can use it anywhere, we can change it in any way, and that code doesn't have to know about edges. So it's a simplification in that way, too. And I hope that, even to a non-programmer, this might look simpler than the other structs before. Maybe.

So, faces. Faces are more complicated. They're a little harder to explain when you get to the internals, but I hope we can do it. A face is a group of face corners. The face corners I've drawn in blue here, and we're focusing on one selected face for the example. Each face corner points at a vertex. And you see those three vertices are also used by other faces in the vicinity; a vertex can be used by many faces. That's why we have the separation between face corners and vertices.

So here's a diagram to represent a similar situation. There are two triangles there. The triangle has three face corners, and each face corner references a vertex. But there are also edges here, and the face corner references an edge. Because the face has a direction, so we walk around the face, the edge of a face corner is the next edge. That's a lot to visualize in your head, so I hope the diagram is mostly helpful.

OK, one more struct. This is how we stored faces two years ago. One obvious thing we can find: see the mat_nr? That stands for material index, obviously. I bet in a lot of the meshes we deal with, there's only one material, so the index is always going to be 0. We don't need to store 0 for every single face. There's also another interesting one that I just noticed: see that pad at the end? That's one byte used just to make the memory fit together better. We won't get into the really technical reasoning for that, but we're wasting memory, and memory is not free. So I'm going to propose that we can use one third of the memory to store faces. And we'll do that by focusing on these two parts of the face.

I've taken the diagram we had before and changed it a little bit, so that the faces reference the face corners. So here we have one quad, which has four face corners, then one triangle, which has three, and then another quad. For these two items, loopstart and totloop, we have to learn another word: "loop", which is the word Blender has used for maybe ten years to describe a face corner. And I'll say it took me probably a year to learn what "loop" meant, which is why I'm trying to transition Blender to talk about corners instead of loops. But I wanted to be faithful to what the code used to say, so I didn't change the names here. Anyway, there are two variables. One is the start of the loops for that face. You see for the first face it's zero, because the face starts at the first face corner. For the second face it's four, because it starts with the fifth face corner (you always have to add one), and then seven. That works because corners are always stored in one big array.
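(Roughly what those two structs looked like, simplified; a sketch rather than the exact old declarations.)

```c
/* Roughly the old face structs (simplified). */
typedef struct MPoly {
  int loopstart; /* index of the face's first corner in the big corner array */
  int totloop;   /* how many corners the face has */
  short mat_nr;  /* material index: 0 for every face of a one-material mesh */
  char flag;     /* smooth shading, selection, ... */
  char _pad;     /* one wasted byte, only there for memory alignment */
} MPoly;

typedef struct MLoop {
  unsigned int v; /* the vertex this corner points at */
  unsigned int e; /* the next edge, walking around the face */
} MLoop;
```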
Then the next one is totloop, which is the number of face corners. We already went through that the first face is a quad, so it has four corners, and so on. If I flip back and forth between these, you can start to see that the four and the three are sort of redundant, because you can always get to four from zero by adding four. So we're repeating some of these numbers. We realized that and said: oh, we only need to store one integer per face to know where it starts in the array. That leaves the question, where does it end? Well, it just ends where the next one starts. So to know the size of a face, we go to the next offset and subtract the two. Let's say we want to know the size of the triangle. We know it starts at four, then we go to seven and subtract four, so the size is three. So there's a little bit of complexity there, but using one third of the memory for faces is very nice.

And maybe I'll clarify why that matters, actually, because it's not just about using 100 megabytes instead of 300 megabytes; you can always buy more RAM. It's also about how long it takes to go through that memory on the computer. Doing that isn't free, and it also fills up the caches in the CPU. So it's not just about using less memory, it's also about making all these algorithms faster.

OK, just one last thing about face corners, which we've already sort of covered. We can predict the pattern by now: we take the struct, we split it up, we store it in arrays. Tell me if I'm going too fast, but the concepts are very similar here; it's just more of the same changes. A face corner references a vertex and an edge, and most of the time you only care about one of them at a time. Let's say you want to know all the positions in a face: you only need to know about the vertices. See how they were packed together before? You would have to read twice the memory. Now you only need to go through that blue chunk, if you care about the face centers, for example. And that makes a lot of these algorithms about 30% faster, just for a simple iteration over faces. At least, that's what we observed in the refactors. So it's not a game-changing, make-Blender-four-times-faster sort of thing. It's mostly about doing the basic changes at the lowest level possible, so that everything else can benefit. And maybe this is meaningful to some programmers, but here's the difference in the iteration before and after. Before, you would iterate through the corners, the loops, of a face, and then access the vertex from each corner. Now you just iterate through the vertices of a face.
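(In code, the offset trick and the new iteration look something like this sketch; one extra offset is stored at the end so the last face works too.)

```c
#include <stdio.h>

/* For the quad, triangle, quad example: face_offsets = {0, 4, 7, 11}.
 * The size of a face is the next offset minus its own. */
int face_size(const int *face_offsets, int face_i)
{
  return face_offsets[face_i + 1] - face_offsets[face_i];
}

/* After the refactor: iterate a face's vertex positions directly
 * through the corner_verts array, with no corner struct in between. */
void print_face_positions(const int *face_offsets, const int *corner_verts,
                          const float (*positions)[3], int face_i)
{
  for (int c = face_offsets[face_i]; c < face_offsets[face_i + 1]; c++) {
    const float *co = positions[corner_verts[c]];
    printf("(%.2f, %.2f, %.2f)\n", co[0], co[1], co[2]);
  }
}
```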
So that's a lot of changes and a lot of structs. But what did the refactor actually look like as we were doing it? How did we spend our time? There's the MLoop struct; hopefully that's readable. And we search for it. We want to remove this thing, so how often does it actually appear in Blender? About 1,000 times. So start there. Take the first one and change it, the same way I showed the loops changing. Then keep doing that same sort of change, in slightly different ways, 1,000 times. One week later, there might be 900 uses of the struct. Three weeks later, you might be down to 600. And two months later, doing this sort of refactoring, maybe they're all gone. Then, once everything goes through a view, we can make the change to the way the memory is actually stored and read. And it's the same sort of thing for edges and faces. There were about 1,000 changes needed for most of these. So that ended up taking two years. Half the work is convincing other people that the changes are worth it. It helps a lot to have someone actually interested in doing the work, and that was me, and others, thankfully. And then 24,000 lines of code changed and removed. One thing I'm really happy with in those numbers is that we didn't add much more code than we removed. But that's a thing a developer would care a lot about, not necessarily artists using the software.

But yeah, you can't change code without breaking things, especially thousands of changes, multiple times, in code that's maybe sort of old. So in preparing this talk, I went through all of the commits and looked at all of the bugs referencing those commits, and I counted about 94. I'm sure that isn't all of them. So these are all real bugs: the Screw modifier doesn't work, the Copy Location constraint doesn't work. One anecdote I heard a lot, working remotely, is: "Ugh, Blender's broken. It crashes when I enter edit mode. But Hans is asleep. What are we going to do?" So yeah, luckily there were a lot of triagers, and people actually interested in testing Blender and reporting these bugs. The whole project is impossible without that. I can also report that I found about 95 to 100 fixes. Some fixes fixed multiple bugs at the same time; some fixed bugs that weren't even reported. And I will also say that some of these refactors fixed bugs that were already in Blender. So we're not doing terribly. But yeah, that's a lot of breakage. And these are all high-priority reports, because they're regressions: something we generally need to stop our other work to solve. So that's not trivial at all, and these changes come with a cost.

In going through all these commits, I found some areas that kept breaking over and over: the old particle system, or baking, especially baking from multires, modifiers, or undo in sculpt mode. These are areas that, as a developer, you look at and find a little suspicious, because they're old and the code maybe hasn't aged so well. But some code was nicer to work with. Cycles, for example, has tons of tests, so if something is broken, you generally know about it before it gets into the hands of users. Same with geometry nodes; that's been improving, and if I notice a bug and fix it, it can become a test. That's very easy. But in retrospect, it would have been nice to take some of those sketchy areas and add tests before starting the project. It's easy to say that now, but we didn't really know how big this project would be beforehand, in our naivete, or my naivete. I think another way to phrase it is that these are growing pains. You need to go through this sort of change to bring Blender into the future, where people care about having lots of geometry and improving performance. So it was necessary, I think, at least some of that pain.

But that raises the question: why? When does someone start a project like that, and is it worth it in the end? Yes, I would answer that it is. But it's also not that simple, and I've learned some things about myself and how I focus on work that I want to talk about. One of those is obsession. For me, Blender started as a hobby. And as a programmer, I couldn't help turning that hobby into programming. But the more time you spend on Blender, the more it becomes a passion.
And then, if you're working on things that other people care about, maybe it becomes a job. So there's a work week right there, which is a bit of a lie, because when your passion becomes your job, maybe it becomes an obsession. I'm sure that's familiar to people in here. But I look back on that, and I know there's nothing essential about this project that needed it to be an obsession. I know it needed to be a passion; no one refactors legacy code for two years without some sort of passion. But yeah, it's something I've learned about myself. And maybe about Blender too, because looking back on a bunch of important things that have become part of Blender, things everyone relies on, I see a lot of people's obsessions and obsessive work on improving things. That's really cool. But for me, I've learned the importance of balance and of just working sustainably.

OK, that little digression is over. Let's talk about a couple more technical things. This topic is really cool, and it's sort of the last piece of the performance puzzle that lets us take advantage of all these changes and push Blender's performance way farther: implicit sharing. It's a weird term. You might have heard of copy-on-write before, something Blender has tried to do in the past. I'm going to try to explain it in a very simple way, with just a cube.

So here's a cube. It has four attributes attached to it. Or rather, it uses the data of four attributes. If we copy this cube, we can reuse the same data. And let's say this is actually a huge cube with 100 million points: that's a ton of work we've just avoided doing. And it's not just RAM; we also didn't spend the time copying all the data. Now we add a third cube. It's still using the same data. This is like instancing: you can draw all these cubes with much less work. OK, now we deform the last cube. We changed the positions, so we need a new position array. And we did that just sort of transparently. That's why it's called implicit sharing: we stop sharing the moment we can't share anymore.

And for some setups in geometry nodes, or some procedural processes, this can make a huge difference. It's sort of easy to construct a setup that's as fast as I want it to be: I can make something 100 times faster just by adding a ton of attributes and then duplicating the mesh twice. Suddenly, you're not doing all the copying that you were doing before. And that's only possible if these attributes aren't stored together. Actually, I'll go back just to show that. Let's say we combined the blue, orange, yellow, and white in the same packed fashion we had before. You can't really imagine duplicating just a bunch of those little blue packages; you can only do it all or nothing. And that's why it's important to separate these things, because you will often copy a mesh and only change where the vertices are, not any of the edges or faces.
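(Mechanically, implicit sharing can be as simple as a user count on each attribute array. This is a sketch of the idea only, not Blender's actual implementation.)

```c
#include <stdlib.h>
#include <string.h>

/* Sketch: an attribute array that counts how many geometries use it. */
typedef struct SharedArray {
  int users;
  size_t size; /* size of the data in bytes */
  char data[]; /* the attribute values themselves */
} SharedArray;

/* "Copying" a mesh just adds users to its arrays: no allocation, no memcpy. */
SharedArray *share(SharedArray *array)
{
  array->users++;
  return array;
}

/* Called right before writing. Only copies when someone else still
 * uses the data; that's the "implicit" part. */
SharedArray *make_mutable(SharedArray *array)
{
  if (array->users == 1) {
    return array; /* no one else is sharing it: write in place */
  }
  SharedArray *copy = malloc(sizeof(SharedArray) + array->size);
  copy->users = 1;
  copy->size = array->size;
  memcpy(copy->data, array->data, array->size);
  array->users--;
  return copy;
}
```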
So, Blender in the past has copied... let's say you open one Blender session and it has a large mesh in it. In the past, Blender has copied all that data about four times. One is the original data, which is just from the file; there's no way you're going to get around storing that. But then we've also copied that to make an evaluated object, the object after modifiers have been applied. Then the first undo step also gets a copy of the data. Then, when you want to take that data and bring it to the GPU, whether that's Cycles or EEVEE, they're also copying the data. And that takes time, and it's also wasting memory. So far, we've eliminated the copy for the evaluated object. So that's 25% less memory, just when you open Blender with a large mesh.

To go back to the struct of arrays refactor: curves also have a very similar concept; it's just a little more extreme than the mesh version. When we did these changes to use the new data structure for curves, we found that ten times faster was sort of the range we could expect. And maybe this slide helps explain it, because these structs that were relatively simple for meshes weren't so simple for curves. So you have one curve point, with all this animation data in it, the handle types packed together, and then the tilt, and the weight, and then something called a soft-body goal weight, which was used for soft-body simulations. And whenever you wanted to copy a curve, you copied all this data. The thing on the right is a Nurb, which is another word for a curve in 20- or 30-year-old Blender lingo. And the same way we changed faces to only use one integer, we can change the curve to only use one integer. So it's massively different nowadays. This is the same sort of diagram, for two curve points: just a whole bunch of data jumbled together. But I won't force us to go through the whole process of refactoring that together; only to say, that's what it looks like now. It's just a bunch of attributes. The handle type is an attribute. The position is an attribute. The radius is an attribute. And these are all pieces of data that you can copy, or maybe not copy, if you're using implicit sharing.
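(As a sketch, the new curve storage is the face-offsets idea again: two groups of attribute arrays, one per point and one per curve. Names are approximate, not the exact Blender declarations.)

```c
/* Sketch of the new curves layout (names approximate). */
typedef struct float3 { float x, y, z; } float3;

typedef struct CurvesSoA {
  int points_num;
  int curves_num;
  /* Point domain: one value per control point. */
  float3 *positions;
  float *radii;
  signed char *handle_types; /* optional: only allocated for Bezier curves */
  /* Curve domain: one integer per curve, exactly like face offsets.
   * Curve i owns points [offsets[i], offsets[i + 1]). */
  int *offsets; /* curves_num + 1 entries */
} CurvesSoA;
```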
There we are. We actually have plenty of time. Moving on quickly: what's next? Sort of taking these same concepts and using them in a few more important places. So vertex groups are one holdout that hasn't fully transitioned to attributes. One vertex can be in multiple vertex groups, which complicates things a bit. One vertex can also be in no vertex group at all. Then there are things like shape keys, and more internal things like the original coordinates. So the next step is taking this idea of using the same internal storage types, and using them for all of the data, to add flexibility. We're getting there. And then the other thing that I think will be really important is taking this idea of implicit sharing and using it everywhere. So let's say you want to have two meshes and evaluate them on different frames, for something like onion skinning. If you don't have to duplicate anything, that can be very cheap. But if you do, then it's basically prohibitively expensive. And then Cycles can use that, undo steps can use it. And we've seen that can make Blender 100 times faster in some weird production setups. That's the last slide. So maybe people have questions.

Yeah, those are two really great questions. So the first one was about the names "loops" and "corners", and whether we're transitioning more completely to the corner name. And I'd say we need a little more consensus before I start doing things like a find-and-replace, because there's a lot of code out there, and just going and changing everything is not quite worth it. And then, like I said before, convincing people is sort of half the work, and it's really worth it to make a compelling argument, because the change should feel obvious. I hope that some of the changes to the structs felt obvious, like not storing a material index of zero for every single face. Some of them aren't. And there are maybe benefits to the loop name too.

The other question was about BMesh, which is the edit mode data structure, which, you're right, is not affected by any of this at all, because its performance characteristics are entirely different. It'd be nice to use a more data-oriented approach to store edit meshes, but that's a huge project that I probably can't get into right now. Yeah, I mean, there are also things like: BMesh has the same concept of a loop, so maybe if we renamed loop for meshes, we could keep things similar. Yeah, good question.

Okay, so the question was: there was the past with the old data structures, and the present with the new ones. When did we switch? What's the splitting point? And the answer is different depending on whether you're a user or a developer. For me, there was no single splitting point: as these things were finished, they went into Blender and people started using them. The really important distinction is that we kept writing the old format into Blender files, so you didn't really benefit from the changes in the Blender file itself. That was so we could save the mesh and still be able to open it in an old Blender version; that's forward compatibility. But in 4.0, which is releasing in a week and a half, I guess, we did make the switch. That was also a painful process, because once we made the switch, if you open such a file in any older Blender version, that older version would just crash. So that's another thing where, in retrospect, maybe we would have structured the project a little differently. But that's better nowadays, and that's when you'll start seeing the benefits in blend files.

Yeah, so the question is: 10x, wow, that's a lot. Can the next steps give that same sort of improvement? This is why I have to be really careful saying things like "10x performance improvement", because it's really easy: knowing the data structures, and knowing that copying attributes was slow before, I could make you a file where it's a million times faster, or where your computer crashed before because it ran out of memory. I think implicit sharing will make things possible that just weren't possible before. It's hard to say whether we can make things 10x faster again. But I think if the data is stored in a way that's conducive to performance, conducive to the way that computers actually work with memory, that opens a lot of options that you wouldn't have before. One of the interesting ones is SIMD, single instruction, multiple data, which is how a lot of newer CPUs can work on multiple pieces of data at the same time. And we can do that better if everything is packed together.
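(For instance, a sketch, not Blender code: scaling contiguous positions four floats at a time with SSE intrinsics. This only works nicely because the positions are no longer interleaved with flags and normals.)

```c
#include <immintrin.h>

/* Sketch: scale every position by a factor, four floats per instruction.
 * Possible because positions are one contiguous float array now. */
void scale_positions(float *co, int verts_num, float factor)
{
  const int n = verts_num * 3; /* x, y, z packed back to back */
  const __m128 f = _mm_set1_ps(factor);
  int i = 0;
  for (; i + 4 <= n; i += 4) {
    _mm_storeu_ps(co + i, _mm_mul_ps(_mm_loadu_ps(co + i), f));
  }
  for (; i < n; i++) {
    co[i] *= factor; /* scalar tail for the last few floats */
  }
}
```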
So maybe that's another ironic thing about improving performance. The question was: how many more vertices can I add? So in the end, we didn't make anything faster, we just made things bigger, which people also care about. But I mean, you can do that sort of calculation: let's say you used to use up your memory before, and now you have more space. There are also different bottlenecks. One might be memory bandwidth and the cache usage on your CPU, but another might be how many multiplications you can do in a millisecond. So there are different answers. And it's a hard question to answer; arguably I should have a better answer for you. And I should have done more testing at this point, to say "oh, this is the actual percentage improvement in sculpt mode", but I guess we'll see.

I'm going to try to rephrase your question to make sure I understand it. So the question is: for arrays of Booleans, the true-or-false attributes, those are stored as eight bits each, which is really more than we need when it's only one true or false value. And that's a really good point, and that's something I hope we can optimize in the future, which would be eight times less memory (there's a little sketch of the idea at the end). If it's free, let's take it. Yeah, go ahead.

So the question is about curves, and how related these concepts are. And yes, Grease Pencil is using the same sort of "let's store the data in a way that matches the way computers work" approach. I'll say that in the past few years, there have been three different curve systems. One was the old one, from 25 years ago. Another was a data structure we wrote temporarily that was sort of halfway in between. And now curves and hair are the same thing, and it's just two groups of attributes, one for curves and one for points, and they're just a bunch of gigantic arrays.

So the question is about the difference between linked data and implicit sharing. And the difference is that you can use implicit sharing without knowing about it. So if you copy a mesh, you have two meshes. On the logical side of things, there's no relation between them, only that they use the same memory. And the artist doesn't have to care about that, and the developer doesn't have to care about that either. Just the moment you change something, it's duplicated.

So the question is about lazy calculation. I touched on that a little bit with normals, and I meant to talk about it a little bit more. But that basically means: don't calculate the normals until you actually need them. And we do the same thing for loose edges, the edges that aren't attached to any faces. We do that a lot more now than we used to. The idea is that we don't need to write which edges are loose into the file; let's just calculate it when we need it, and we often don't need it. That's now also done for topology maps, so the corners around a vertex, which also means that when you copy a mesh, you don't have to recalculate those maps. So I guess "more and more" is the answer, but we have to be careful not to do it too much when the data isn't actually that expensive to compute.

I think we have time for like one more. You've had your hand up. So the question is about removing a vertex: is that sort of slower now? And I think the answer is no, compared to before, for the actual mesh structure. Before, you might have had one array, and when you remove an item from the middle, you have to copy all of the stuff at the end so that there are no gaps. The difference now is that you have to do that in a bunch of different arrays, but that's not really slower, because the arrays are smaller. The difference compared to the edit mesh structures, where there's a real advantage for BMesh, is that there you can remove a vertex and leave a hole, and then you don't have to do anything. So that's one reason you can't just say, let's replace everything with this new data structure. I think we're out of time, but feel free to ask me a question in person later on. Thanks.
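(A sketch of the bit-packing idea from the Booleans question above. The helper functions are hypothetical, for illustration, not actual Blender code.)

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch: one boolean per bit instead of one per byte. */
bool bit_get(const uint8_t *bits, int i)
{
  return (bits[i >> 3] >> (i & 7)) & 1;
}

void bit_set(uint8_t *bits, int i, bool value)
{
  if (value) {
    bits[i >> 3] |= (uint8_t)(1 << (i & 7));
  }
  else {
    bits[i >> 3] &= (uint8_t)~(1 << (i & 7));
  }
}

/* A selection attribute on 8 million vertices takes 1 MB this way,
 * instead of 8 MB with one byte per value. */
```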