Okay, thank you everyone. Good morning, I'm Bruno Schmidt, and I'm here today to present the EPFL logic synthesis libraries, which are the result of the combined work of many people in our LSI research group at EPFL. So we have this collection of libraries, an open-source infrastructure to do logic synthesis. We have state-of-the-art algorithms. We target both conventional, classical logic synthesis and, more recently, quantum compilation, and our libraries have the MIT license, as far as I remember.

So, a brief outline of this talk. I will talk a little bit about logic synthesis. I'm sorry about that. Okay. It was fine before. Well, let's try to go like that. So I'll talk a little bit about logic synthesis, then the motivations and goals of the libraries, then a bit about their implementation, and then give an example of how to use them to do something interesting.

So, logic synthesis is the process by which we take some abstract specification of a circuit's behavior and map it to some technology-dependent logic primitives. So we have a specification, for example in RTL, Verilog or VHDL, and we want to map it into an FPGA, and we know that at some point we need to map it to some LUT network, the lookup table networks that lie inside FPGAs. To do so, we use a bunch of different ways of representing logic: products of sums, sums of products, logic networks, decision diagrams, and truth tables. Basically, logic synthesis is this: we start with one representation, we do some optimization, we go to another representation, do some more optimization, and we keep transforming until we are happy, and then we do some technology mapping.

So what are the motivation and goals of these libraries?
So, we're doing research in logic synthesis, and we are a bunch of different researchers. We all did specific things, and what happened a lot is that we were all reinventing the wheel every time we wanted something. So basically, we want to reuse functionality amongst us and amongst the community. That's what we are hoping for. We really wanted those libraries to be easy to integrate, easy to adapt for our needs, and easy to contribute to. They are quite modular, so each library targets one specific task in logic synthesis, and then we can compose them to create bigger frameworks. For example CirKit, if you guys know about it, is today just the composition of all these libraries into a tool to do logic synthesis.

We are also motivated by some lessons learned from the development of Berkeley's ABC. Alan Mishchenko, its creator and maintainer, gave us a talk where he shared his experience and the things that he would have done differently if he were to implement ABC now, so we try to keep that in mind as well.

So these are basically the nine libraries that we have to date, and as I told you, each one targets a specific thing when doing logic synthesis.

A bit about implementation before we go into the example. They are modern implementations: they use C++14 or C++17. They are header-only. They have almost no dependencies, and when they do have some dependency, it's usually header-only and shipped together with the library, so you just look at the folder and start playing with it. They are well documented and well tested to some extent.
We're still working on tests. It's a hard thing.

So for this presentation, because of the time limit, I will only focus on these four libraries and show you how you can compose them to go from a specification to a LUT mapping that can then be given to place and route and then put on an FPGA. So let's move on to the example. We start with the specification and we want to end up with this LUT network. And for the sake of making this presentation interesting, the LUTs will have a size limit of three inputs. Otherwise it would be too easy: we would just use one LUT.

So what is the function, the circuit, that I want to implement? It is a combinational circuit I call prime4, which basically takes as input a four-bit number and outputs one if this number is prime, zero otherwise. Quite an easy example. Here I have a Verilog description of this behavior, and you might be looking at it and saying, okay, this guy has no idea how to write Verilog. There is a really good reason for that, but it's also true: I'm more of a VHDL guy.

So this is the bird's-eye view of what I want to show you. We'll do it in three steps. We'll first take the description and extract a logic representation out of it, and then, last, we'll do the technology mapping. Hopefully, if we have time, I will show how to do some optimization in between those two steps.

So how do we do the parsing? For parsing, we will use the lorina library, which has a collection of parsers for various file formats used in logic synthesis. We have AIGER, BENCH, BLIF, PLA, and Verilog, but only very simple gate-based Verilog, and that's the reason why the Verilog looked so weird in the previous slide.
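As a reference point for the prime4 function described above, here is a minimal stdlib-only C++ sketch of its behavior and its 16-bit truth table. The names `is_prime`, `prime4`, and `prime4_truth_table` are hypothetical helpers made up for illustration; they are not part of lorina, mockturtle, or kitty, and this is not the Verilog shown on the slide.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: trial-division primality test for small numbers.
constexpr bool is_prime(unsigned n) {
    if (n < 2) return false;
    for (unsigned d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}

// prime4: takes a four-bit number and returns 1 if it is prime, 0 otherwise.
inline unsigned prime4(unsigned x) {
    assert(x < 16 && "prime4 is only defined on four-bit inputs");
    return is_prime(x) ? 1u : 0u;
}

// Truth table of prime4 packed into 16 bits: bit i is the output for input i.
constexpr std::uint16_t prime4_truth_table() {
    std::uint16_t tt = 0;
    for (unsigned x = 0; x < 16; ++x)
        if (is_prime(x)) tt = static_cast<std::uint16_t>(tt | (1u << x));
    return tt;
}

// The primes below 16 are 2, 3, 5, 7, 11 and 13, which gives 0x28AC.
static_assert(prime4_truth_table() == 0x28AC, "unexpected prime4 truth table");
```

A truth table like this is a handy golden reference: whatever network comes out of parsing, optimization, or mapping should still produce the same 16 output bits.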
We are also able to parse Liberty files, though that is still a work in progress. The library itself is lightweight and quite customizable. Basically, the parsers read the input and invoke callbacks whenever a parsing primitive is completed. So you need something to interface with this library, and what we will use in this presentation is our logic network library, mockturtle, which provides various logic network representations and implements several reader callbacks for AIGER, BENCH, PLA, and Verilog.

So now we use the Verilog parser, and by parsing the input we get this network representation. On your left you have an and-inverter graph (AIG). You have the inputs at the bottom as triangles, one output at the top, and a bunch of nodes in between. The nodes are two-input AND gates, and the edges connect those gates; when you have a dashed edge, it means there is an inverter. You can also use the majority-inverter graph (MIG) representation that you have on the right, where the main difference is that the nodes are three-input majority gates. It also has this constant-zero node right there. That node is also present in the AIG, but it is not needed there, so I omitted it, and I will be omitting it for the rest of the presentation.

But these are logic networks after optimization. Actually, when we read them in directly, they look much messier, and they would not fit on this slide. So basically, when we first read the Verilog, this is the and-inverter graph that represents the logic of the prime4 input.

So let's talk a little bit about the mockturtle library. It is based on this philosophy of four layers.
You have a network interface API that basically defines some naming conventions and methods for classes that implement the network structures, and that allows us to implement the algorithms in a more generic way. This will be better understood in a few slides. Then we have the algorithms themselves, which use the API: algorithms for logic synthesis, optimization, and technology mapping. Then we have a bunch of network implementations. You might want to represent your logic as an and-inverter graph, a majority-inverter graph, maybe XOR-majority graphs, XOR-AND graphs, k-LUT networks, or whatever kind of network you want to implement yourself, and as long as you keep to our naming conventions, all the algorithms that we have implemented to do optimization and mapping should work. Lastly, we have this layer for performance tweaks, where you can specialize some algorithms to work better on some networks. I will not go into the details of this, and I'm not even sure if we are using it right now.

So, okay, we have our and-inverter graph, and now what we want to do is map it. We have a bunch of algorithms: cut enumeration, LUT mapping, node resynthesis, cut rewriting, refactoring, a bunch of different algorithms. And if you have no idea what these are, it is not such a problem, because we are well documented. You can go to the web page with our documentation and it will explain what the algorithm does and how it does it in detail, sometimes in too much detail, quite verbose, but then you can understand and learn what's going on.

So let's begin with LUT mapping. Basically our goal is to cut our Boolean network in a way that the different pieces fit into our LUT-3 primitive, and we call these pieces cuts. So we cut it into cuts, and there are many different ways to do that. So basically we could do it like this, and you see that our resulting LUT network
would use seven LUTs. But of course there are many ways of actually cutting a network. So what we do first is cut enumeration, and we do it bounding some parameters. We bound the size of the cuts, because we are only interested in cuts with a maximum of three inputs, and we also give a maximum number of cuts, because we don't want to keep enumerating different cuts forever. So we take this representation and enumerate a few more cuts, and when we are happy with the number of cuts that we have, we select a set of cuts that will map the whole logic network. Our aim here is to find a good mapping with respect to some cost function that you can define: maybe you're interested in a shorter critical path delay, or maybe you're interested in using a small number of LUTs. So by selecting different cuts, instead of having this mapping, we could do slightly better, or we can do even better and minimize not only the number of LUTs but also the number of logic levels in this network.

So by now I could say, okay,
I'm done, but that would be no fun. So let's look at how to do a bit of optimization using mockturtle. For optimization we will do cut rewriting, and I have already presented how to cut the network. What this algorithm does is try to rewrite cuts in terms of another set of nodes. We take one cut, and the cut is a subnetwork itself, so we try to minimize the number of nodes inside that cut. For example, let's take this cut. We want to compute a replacement for it, so we want a network that implements the same functionality that is being realized by this set of nodes. The first thing we do is represent more explicitly what function this cut is implementing, and for that we use a truth table.

And when dealing with truth tables, the kitty library comes in quite handy. It provides two data structures to manipulate truth tables: static truth tables and dynamic truth tables. The main difference is whether you know the number of variables at compile time or at run time, and some of its algorithms will be faster if you know the number of variables at compile time. It also provides several algorithms for operations on truth tables: finding implicants, canonization (NPN, spectral). If you have no idea what that is, you can again look at the documentation; it is explained there.

So we take the cut and generate a truth table for it. We can generate the truth table basically by inputting all possible input combinations and seeing what comes out, and then we get our truth table. Now we want to synthesize a new network that implements the same function, and to do so we use exact synthesis, which, given a specification, in this case the truth table, finds an optimum Boolean network, where optimality is defined with respect
to some cost function. For that we use percy, our exact synthesis library. It offers a collection of different SAT-based synthesis methods for circuit resynthesis, design exploration, or function classification. It is easy to prototype and experiment with topology-based synthesis, different encodings, different SAT solver back ends, and parallel synthesis. So when we put this through percy, it gives us a slightly better network for this cut, which uses four nodes instead of five.

So the cut rewriting algorithm basically computes potential replacements like this for our cuts and then heuristically selects a maximum set of cuts that do not conflict with each other and that maximize the overall gain we're trying to get. In this example, what I'm trying to do is minimize the number of LUTs that I will be using. By applying the algorithm to the different cuts, we can actually get this network, which can be cut to use only three LUTs in the end. So we were able to squeeze out one more LUT from our previous selection by using this optimization.

So this is basically a simple summary of what we have seen. You start from Verilog, you extract a logic representation of it, in this case an AIG, you optimize your circuit using cut rewriting or some of the other techniques that we provide, and then you map it to a LUT network.

So far everything is quite easy, and I have made it easy and masked some of the details of the implementation. But talking is easy. What does the code look like to do all these steps that I just told you about? It is basically this code. Here we first parse the input and create the network. Then we do the cut rewriting, and when you do cut rewriting you end up with some dangling nodes in your network, so you need to clean it up afterwards. And then you map it to LUTs. It is as simple as that, and afterwards you can write it out in one of the formats that can then be the input for
some place-and-route tool that comes afterwards.

So here we have another example that also uses alice. Most of the showcases and examples you can look at in the GitHub repository. This example provides a simple shell interface using the alice library. It loads truth tables from binary and hexadecimal strings, loads from BENCH, does some NPN canonization, and then you can play around. And to have something like this, which is more of a kind of tool, it only takes 268 lines of code from the user's point of view, that is, for the people who are actually composing the libraries and doing their thing.

So with that I come to the end of my presentation. I do have some other slides on a few of the other libraries, but as the slides are not displaying so nicely here, I think we can end it here and go to questions.

So, does the tool handle multiple outputs? Yes, it does handle multiple outputs. It is just that for those slides, for the sake of simplicity, we're using just one output.

So, how the modules connect to each other, is that what you're asking? So basically, for example, mockturtle already has some of the other libraries embedded in it. mockturtle already has the lorina library inside it, and kitty and percy. mockturtle, because of what it does, has most of the other modules already in it. They have their own namespaces, but they already come with the code itself.
I don't know. Yeah, exactly. So when you get mockturtle you get almost everything, if not everything. But maybe you don't want to deal with logic networks at all, you just want truth tables, so you don't need all the rest and you can take only kitty. mockturtle is the one that has kind of everything; it is where everything comes together. And then when you have alice, you have the shell interface, and alice does logging, you can create commands and stuff like that. Also, with alice you can interface the code with Python, so you have Python bindings as well. We have some flows that use Python, and then you can easily script and hack some of these optimizations together.

So the question is if we can override cost functions. Yes, you can pass cost functions as parameters, and that's basically what we do. Some of these algorithms are also used in the quantum compilation flow, and in quantum we do use different cost functions. So they are configurable: as long as it is an object that can be called as a function, you can pass it.

Any more?

So how does this fit in a broader system, that's the question. Basically, whenever you use something like ABC, if it is for logic synthesis, you could more or less replace it with the libraries that we have here. ABC has more things, in formal verification as well. But for example, this last semester there was a lecture in EDA at EPFL where the students were asked to use those libraries with Yosys, and we definitely want to do more integration with Yosys.
We just do not know how to use the Verilog parser in Yosys to harness and get all the different combinational parts, and then we could do optimization there. But the goal here is just to expose ourselves and try to put ourselves into these other flows. We're quite open, and we have been talking to other groups that are willing to use those libraries in their whole flows.

Yeah, exactly. Actually, if I wanted to do this example, I would still need to tweak a little bit, because the default is lookup tables of four. Oh, okay, sorry, I need to repeat the question. So the question is if I can change to different sizes of lookup tables, and yes, you can configure yourself what the size of the LUT will be.

Not really, basically.

So, have we considered using some kind of intermediate representation to plug into other flows? We already support many of the intermediate representations in logic synthesis. Some of those flows can output a BLIF file, for example, and then we can read it. You can pass us BLIF and we can output BLIF, so that will be more or less the intermediate representation you're talking about. As for other RTL languages, more in the in-between, I don't know. The problem with RTL languages is that the parsers are quite complex and require a lot of work to put in there, and we don't do it currently.

So the question says that the optimal mapping should be correlated with timing. We use some sort of abstraction of it, basically the number of logic levels in your network, which more or less will tell you about delay. We do not know exactly what the delay is, because we do not have the exact timing.
To do the exact timing, we need to do place and route. So actually this is the reason why we are doing this Liberty file parsing, where you can get this information back, and then you can loop around and use more precise timing information to guide your logic synthesis optimization algorithms. We are working on that.
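The delay abstraction mentioned in this last answer, the number of logic levels, can be sketched in a few lines of stdlib-only C++. This is an illustrative sketch under stated assumptions, not mockturtle's implementation: the `Node` struct and the `logic_levels` function are hypothetical, and the network is assumed to be stored in topological order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified network: node i lists the indices of its fanins.
// Primary inputs have no fanins. Nodes are assumed to be in topological order.
struct Node {
    std::vector<std::size_t> fanins;
};

// Returns the number of logic levels (depth) of the network: primary inputs
// sit at level 0 and every gate is one level above its deepest fanin.
inline std::size_t logic_levels(const std::vector<Node>& nodes) {
    std::vector<std::size_t> level(nodes.size(), 0);
    std::size_t depth = 0;
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        for (std::size_t f : nodes[i].fanins) {
            assert(f < i && "fanins must precede the node (topological order)");
            level[i] = std::max(level[i], level[f] + 1);
        }
        depth = std::max(depth, level[i]);
    }
    return depth;
}
```

Counting levels treats every gate or LUT as one unit of delay and ignores wiring, which is why it is only a proxy until timing information from place and route is fed back.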