All right, so welcome. As I mentioned, the speaker for our previously scheduled talk on gEDA was unable to attend due to an unforeseen illness, so my name is Seth Hillbrand, and I'm officially one of the lead developers for the KiCad team. Oh, thank you for reminding me. Today I'm going to talk about something that's only tangentially related to KiCad, because Wayne already gave you all of the details you need from the very beginnings of KiCad. Instead, I'm going to talk about a pet project of mine addressing one of the issues we run into in KiCad. KiCad has, if you're not aware, one of the single largest 3D model libraries for electronic components in existence. There are thousands and thousands of 3D models that are open source and freely available. We use CadQuery, if Adam's still here too, yes, we use CadQuery predominantly to generate them, and we use CadQuery through FreeCAD. The FreeCAD folks also spoke earlier today. And we output STEP files using OpenCascade, whose talk just finished. Now, with so many files, we have a somewhat unique process, in that we're not generating a single model that we then output to a STEP file. Our dedicated librarians, who honestly are vastly underappreciated in the whole KiCad ecosystem, work tirelessly to expand, curate, and create these models for people to integrate into their electronic designs. The difficulty is that we have to store all of these models, and when you go out to a package manager, the package manager also has to generate packages for these files. What that looks like on a package manager, say Fedora's, is that you first download the repository, then you install the repository inside the chroot, and then you archive the output of that into your final package. That means you have three separate copies of whatever the output is.
In the case of KiCad, we're pushing about six gigabytes worth of 3D models that get downloaded, installed, and finally packaged, meaning the build environment for package managers gets larger and larger, and for our end users the downloads get larger and larger. This is an awesome problem to have. It's a great problem to have, but it's still a problem, so maybe there are some ways we can address it. STEP files are universally accepted by any 3D editing and modeling program worth its salt. It is an open specification in the sense that you can go and look at what the specification is, and if you can read it without someone walking you through it the first time, then I really want to meet you, because you're a very smart person and you can teach me a lot. But because it is an open specification, we can look at it and understand exactly how we utilize it, in order to potentially optimize the output. STEP files are inherently redundant. They're text-based. For example, here, I'm going to try to use my cursor, oh, there it is. Here we have a 16-pin DIP package from the KiCad model library, and we have 16 pins. Those are all exactly the same pin. All you're doing is taking pin, pin, pin, pin, turn around, pin, pin, pin, pin. You are making the exact same part, and if you were to utilize all of the tools that STEP gives you, you could make that one part and say, copy it at this offset, then change the offset. But you can also generate it another way, and all of the STEP file output engines will typically generate it that other way, which is: I have pin one, I start by creating the origin, then I create a vector from that origin, and now I can extrude along that vector.
Oh, I need to extrude again in the same direction, so I start with another origin, then I create another vector, and then I extrude along it. So you have this function repeated throughout the file. It's a very safe way of generating. But when you generate in this safe way, you end up with 5.8 gigabytes worth of data for the KiCad library. And DipTrace, another EDA software package, I'm not sure if people are aware of this, provides an open library as well, in that they provide STEP files you can go out and download, at about 4.7 gigabytes total for all of their models. Manufacturer models too: if you go to, say, Samtec for a high-density, 300-pin connector, that 300-pin connector is going to cost you 40 megabytes worth of STEP file, because it generates an origin and a vector for every separate element it extrudes in that model. But you get your information. The end result is good for the user, because you get guaranteed information, as long as you're willing to give up that disk space, that bandwidth, and critically, that load time. Every time you load those 40 megabytes of data for that model, you need to pull all of it in, and that slows down your overall process, which slows down your iteration. So we can optimize this a little bit. So, content redundancy: this is an actual STEP file you're looking inside, you can see the text here. This STEP file happens to come from the OpenCascade 6.6 processor, so it's a little outdated, but it's still generally valid.
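To get a feel for how much of a STEP file is this kind of repetition, here is a small illustrative sketch (not part of the actual tool; the function name and regex are my own) that counts what fraction of entity definitions in a STEP DATA section are exact textual duplicates of an earlier one:

```python
import re
from collections import Counter

# Matches "#<id> = <body>;" entity definitions in a STEP Part 21 DATA section.
# Naive: assumes no semicolons inside quoted strings, which holds for simple files.
BODY_RE = re.compile(r"#\d+\s*=\s*(.+?);", re.S)

def duplicate_fraction(step_text):
    """Fraction of entity definitions whose body text already appeared earlier."""
    counts = Counter(BODY_RE.findall(step_text))
    total = sum(counts.values())
    dups = sum(n - 1 for n in counts.values())
    return dups / total if total else 0.0
```

On real library models generated pin by pin, a large share of the Cartesian points, directions, and axis placements come back as duplicates.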
The first element on each line is a key, a string key in the STEP format; it just needs to be unique. What OpenCascade outputs is a unique number, which is a perfectly valid way of doing that, and then everything else references those unique numbers. So you can see here that element 11 is setting up an axis placement in three dimensions that references elements 12, 13, and 14. Look at 12, 13, and 14: 12 is a (0, 0, 0) Cartesian point, 13 is a direction in the Z-hat direction, right? One in Z-hat. And 14 is a direction in the X direction. So what we're doing here is setting up a plane, the Z-X plane. Now we're going to do something else. We want another plane, so what do we start with? We start with the origin. But we don't go back to the original origin; we make a new origin, and so there's a new origin here that is also referenced. STEP files are inherently reference-based, so if we intelligently walk through this file, we can look at it and say: this reference is the same as that reference. All of the origins are exactly the same, so we don't need to output them; we just remember which number came first and tell everyone to look back at that original reference. All of these can be incrementally optimized, giving a much faster load speed and a substantially smaller size. And how do you do that?
Well, you step through, and you can either parse out the individual elements, in which case you have to recognize, I'm trying to remember the exact number, I believe about 400 different commands in the STEP format. So you could parse those 400 different commands and figure out how to do that referencing. Or you do something a little safer, and perhaps more naive, which is what I chose to do here: a string comparison. That's also perfectly valid, because if you have two lines that represent the exact same string, they're the exact same thing. So we take all of those exact same lines from the same file and put them together. Now we have our Cartesian point here, number 12, and instead of setting up a new origin every single time, both planes reference it: the first plane, number 11, references the origin at number 12, and the second plane, number 15, also references origin 12. This works out pretty well for us. What can we do with, say, an individual model? In this case a QFN68, a 68-pin non-leaded quad flat pack. I have two versions here that I pulled from DipTrace, and one from the KiCad library. The KiCad one was originally about 1.7 megabytes for this one file representing a 68-pin model, and after you get rid of all the repetition and just put in references, we're down to about 660 kilobytes, so we lose a little over 50% of that file size just from repetition. Same thing with DipTrace: we go from about a one-megabyte file to a 530-kilobyte file. We get these savings without losing any information, and that's the critical part. We're not talking about compressing this in a way that actually loses any information about the content of the STEP file.
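The string-comparison pass described above can be sketched in a few lines. This is a simplified illustration under my own assumptions (the real tool handles the full Part 21 grammar; here I assume one entity per `#id = BODY;` definition and no semicolons inside strings): each entity body is rewritten with already-deduplicated references, and if the resulting body has been seen before, the entity is dropped and all later references are redirected to the first occurrence.

```python
import re

ENTITY_RE = re.compile(r"#(\d+)\s*=\s*(.+?);", re.S)  # "#id = body;" definitions
REF_RE = re.compile(r"#(\d+)")                         # "#id" references in a body

def reduce_step(data_section):
    """Drop entities whose body text duplicates an earlier entity, remapping refs."""
    canonical = {}  # body text -> id of the first entity that defined it
    remap = {}      # original id -> surviving id
    out = []
    for m in ENTITY_RE.finditer(data_section):
        old_id, body = m.group(1), m.group(2)
        # Rewrite references inside this body to point at surviving ids.
        body = REF_RE.sub(lambda r: "#" + remap.get(r.group(1), r.group(1)), body)
        if body in canonical:
            remap[old_id] = canonical[body]  # duplicate: alias to the first copy
        else:
            canonical[body] = old_id
            remap[old_id] = old_id
            out.append("#%s = %s;" % (old_id, body))
    return "\n".join(out)
```

For example, two identical `CARTESIAN_POINT` origins collapse into one, and the axis placement that used the second copy is rewritten to reference the first. A single forward pass like this assumes entities reference earlier ids, which is how OpenCascade emits them.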
What we're saying is that we don't need the same information in multiple places in the STEP file, and STEP supports this natively. So we're not creating a different file format; we're just creating a STEP file that is slightly more optimal. Critically, we leave a few cases of duplication in there, where you might be able to reduce further if you wanted to. You might notice here that element 14 looks the same as element 16: element 14 is a direction along (1, 0, -0), and element 16 is along (1, 0, +0). This makes sense to computer scientists and not so much to physicists. These are exactly the same physical direction, but we don't want to collapse them any further, because for a computer this representation is actually something different. We don't want to make an unnecessary optimization that we might end up regretting, so we play it safe on this front. We get our compression without the additional issues. Now, as soon as you talk about reducing STEP files, the first thing people say is: well, what about stpZ? stpZ is great. In case you don't know, stpZ is a light wrapper of Zlib over a STEP file. Zlib's DEFLATE compression combines a sliding-window back-reference scheme with Huffman coding, so when you repeat textual commands over and over and over again, Zlib builds a code tree that represents them more compactly in a binary format. You get a binary output, so automatically you get about 50% savings, and on top of that, later commands that repeat a lot get encoded as references back into an earlier window.
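The negative-zero case above is easy to demonstrate. As a tiny illustration (my own example, not from the tool): the two direction strings are numerically the same but textually distinct, so a string-comparison pass deliberately leaves both in place.

```python
# Two textual spellings of the same physical direction.
a = "DIRECTION('',(1.,0.,-0.))"
b = "DIRECTION('',(1.,0.,0.))"

# Numerically, IEEE 754 says -0.0 equals +0.0...
assert -0.0 == 0.0
# ...but as text the entities differ, so a string-based
# deduplication pass conservatively keeps both.
assert a != b
```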
So you get a lot more compression, but it's orthogonal to what we're doing, because we take a first-level pass understanding what the references in the STEP file mean, and then intelligently replace repeated commands with STEP references themselves. So you can combine the two approaches, and suddenly you get the benefit of the stpZ format as well as the additional benefit from our replacement technique. And independently, stpZ is pretty good. Here you can see again DipTrace on the top and KiCad models on the bottom. Just going to stpZ, you go from one megabyte in the DipTrace model down to 176 kilobytes. But if you start from the 500-kilobyte step-reduced file instead, you get that additional compression on top of it, and you get down even further. You want to take both of these methods together; that gives you the most bang for your buck. What does that get you? Well, it gets you the ability to support a much larger library for a given bandwidth use and a given amount of hard drive storage, and it prevents having to repeatedly resize each of your build partitions to support a rapidly growing 3D model library. There are also some additional savings in server storage, as well as load time, that you get to benefit from. The KiCad models see even larger benefits: in case you didn't know, they're a little more accurate than the DipTrace models, so there's more information in there, and we actually get a larger benefit from this. Manufacturer models, like Samtec's, come out similarly small. So, how do you know, after you've done all this, that you get the exact same thing out that you put in?
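The "orthogonal" claim is easy to check yourself, since Python's standard `zlib` module implements the same DEFLATE compression that the stpZ wrapper applies. This sketch (illustrative; the function name is mine) measures the compressed-size ratio, which you can run on a raw STEP file and on its reference-reduced counterpart to see that the two savings stack:

```python
import zlib

def deflate_ratio(step_text):
    """Compressed size as a fraction of original size, using zlib (DEFLATE)."""
    raw = step_text.encode("utf-8")
    return len(zlib.compress(raw, 9)) / len(raw)
```

Highly repetitive STEP text compresses extremely well on its own, but applying DEFLATE to an already reference-reduced file still shrinks it further, which is the combined result shown on the slide.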
And critically, for 3D models, any MCAD person will tell you that if you're modifying a 3D model, you need to make sure nothing has physically changed in the output of that model. Because if you change the output of the 3D model, you might as well have broken the whole thing. So it doesn't matter what we save if we don't get an accurate output. The naive way, of course, is to go through and just look at it: all right, there's the regular one on the left and the compressed one on the right, and they look about the same. They look like they're probably identical, but we're not picking up all of the details that way, and critically, we're not going to be able to evaluate hundreds of thousands of models eventually (we're only at a few thousand right now) by eye, to ensure that every single model generated in the library has an accurate transformation. We just don't have the manpower. So what do you need to do? You need to evaluate this in code. What we do in the reduction code is go back to OpenCascade, and you just heard about all of the Boolean operators that OpenCascade supports on solid models. This is fantastic; it's exactly what we need. We load the original model and we load our reduced one, and if OpenCascade gets the same thing from loading one file and loading the other, then subtracting one from the other gives zero. And that's exactly what we do. We use BRepAlgoAPI to cut the two models against each other, and if we are successful, we see no difference. And in fact, we do get a null result across our entire library for the difference between the zeroth-order models directly output from CadQuery and our smaller models. So what does that get us overall? Where do we stand on the larger library? Well, the KiCad STEP library goes from 5.8 gigabytes to 1.5 gigabytes.
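The actual verification uses OpenCascade's Boolean cut, which needs the full geometry kernel. As a lighter-weight, text-level sanity check that you can run without OpenCascade (my own sketch, not the talk's method), you can fully expand every reference chain in both files and confirm the reduced file resolves to the same set of expanded entities as the original:

```python
import re

ENTITY_RE = re.compile(r"#(\d+)\s*=\s*(.+?);", re.S)  # "#id = body;" definitions
REF_RE = re.compile(r"#(\d+)")                         # "#id" references in a body

def expand(step_text):
    """Return the set of entity bodies with every reference recursively inlined.

    Assumes the reference graph is acyclic, as STEP geometry sections are.
    """
    bodies = dict(ENTITY_RE.findall(step_text))
    cache = {}
    def resolve(eid):
        if eid not in cache:
            cache[eid] = REF_RE.sub(
                lambda m: "(" + resolve(m.group(1)) + ")", bodies[eid])
        return cache[eid]
    return set(resolve(e) for e in bodies)
```

If deduplication only redirected references and dropped exact copies, the expanded sets are identical even though the reduced file has fewer entities. The geometric cut test remains the authoritative check, since it verifies the shapes themselves rather than the text.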
So this means we can support four times as many models within the same build volume and within the same download. But more importantly, for the work that I do, when I'm sharing models, I'm usually sharing them over email with colleagues at remote workplaces, because not everyone is on board with Git yet. When you send over email, as soon as you hit that 20-megabyte-or-so limit, most mail servers are going to kick it back; at least mine will. So I always need to get these large board models down to something a mail server will accept. The largest board I've had to output so far is about 60 megabytes, and with this, I can actually get it through the mail server. So that's what I get, and where this goes. The library is available; you can check it out for yourself. If you want to take a look, give it a test drive, and let me know if you find any issues with it. The GitLab address for the library is here. It's just a command-line utility: call it on your file, give it a different output name, and you can compare the two. So thank you for your time, and I'll take any questions.