Hello and a warm welcome to everyone joining this session today. We have Nhan and Dr. Christopher, and Curtis, who we hope will join soon. They're with us to talk about building and solving type-safe optimization models in Haskell, which should be an interesting session for us all. So without further delay, over to you, Nhan and Dr. Christopher. Thanks. Thanks again for having us today. So today we're going to talk about building and solving decision models with functional programming — in this case, Haskell. First: optimization, what is it? Well, we have some function, subject to given constraints, and we need to minimize it, right? And it has just about every application you can name: resource allocation in computing, machine learning (training is also optimization), image processing and reconstruction problems, and so on. Applications of optimization usually revolve around an algebraic modeling language, an AML. So on the one hand, we have the input models — the optimization problems we want to solve; they can come from image processing, machine learning, wherever — you have this model you need to optimize. On the other hand, you have the algebraic modeling language that you describe your model in. And there's a third part, the optimization solvers: someone wrote a very good optimization solver that can take your computational description and try to optimize it — simplex, interior-point methods, and so on. Anyway, looking at the current state of algebraic modeling languages, we identified some problems. For example, GAMS and AMPL are dedicated languages; they're not embedded in a general-purpose programming language, which sometimes makes it awkward to process your data and then describe the model.
It's hard because it's niche. Another problem is type safety: some algebraic modeling languages — in Python, or MATLAB, or Julia — are not type safe, so sometimes we run our model and it turns out we had some type error we didn't know about, and we've wasted our time. Another thing is that in, say, Pyomo in Python, or JuMP in Julia, models are captured as programs — they are instructions — so sometimes it's really hard to analyze and optimize those models. What if we want to identify common computations? What if we want to use mathematical properties to optimize the computations? How do we do that? It would require static analysis of a program, which is not going to be easy. So, what are our goals? Goal one: we want to make the step of translating the model into the algebraic modeling language type safe. Goal two: in the algebraic modeling language itself, we want to capture the models — capture all the computations — and then somehow eliminate redundancy and optimize computations using mathematical properties. And goal three: we want to generate high-performance code, so we can hand it to the optimization solvers and it runs really fast. Those are the goals. So how do we do that? Our solution is HashedExpression. It's a library in Haskell — an algebraic modeling language embedded in Haskell. It has type safety, where the idea is that you cannot construct invalid expressions or models; otherwise the compiler will throw a type error: this is not a valid model.
It captures optimization models in symbolic form — all the expressions are captured symbolically — and we use hash consing, a hashing mechanism, to identify all the common subexpressions, and we have term rewriting to simplify and rewrite computations into our desired form. And we generate C code, which can be embedded in anything, not just an optimization solver. Okay, so let's get started with a first example of using HashedExpression to solve a real-world problem: magnetic resonance imaging (MRI) reconstruction. Basically, this is what happens when you go to a clinic and want to take a picture of your brain. What you have after the MRI machine takes its measurements is a Fourier-space signal of your brain — but you only get a partial signal, a portion of it, which is indicated by this mask: the black spots are where you got data, the white spots are where you didn't. There's also some noise in what you capture, and what you need is to find the image of the brain itself. So why don't we just take the inverse Fourier transform? I mean, we can, but the result is not going to be good — it's going to be a blurry brain image like this. So what do we do instead? We set up an optimization problem. We say: let's find an x that minimizes this expression, which is the Fourier transform of x minus the signal we received, where we ignore the spots where we didn't get data. Then we add a regularization term. And we also add a bound constraint saying that we only care about the pixels inside the brain.
Outside the brain, values should only be within the noise boundary — a very small range of negative and positive values. Anyway, so we set up an optimization problem, and we input it into HashedExpression. As you can see, the way we write it in HashedExpression is very similar to the mathematical formula we had in the first place. You can see the correspondence here: there's the objective, there are the constraints, there's the regularization — everything looks very similar to the formula. Then HashedExpression does all the work of optimizing the computations and generates C code; you feed that C code into an optimization solver, and what you get is the result — the image on the right side here. It looks much better and has details, and then the doctor can do the diagnosis. All right, so that was an example of using HashedExpression. Okay, so now let's get back to how we do it. Let's go back to goal two: remember, we want to capture the models and then be able to optimize them and eliminate redundancy. So how do we do it? Of course we have to deal with expressions, which are the basic building block here. We capture each expression as a directed acyclic graph, and we have a map where we store all the intermediate expressions: each subexpression is a node in the graph, and the nodes are indexed in that map by their hash. That allows us to identify common expressions — a common subexpression will always be indexed at the same entry in the map. Anyway, here's an example of how we store the expressions: you have variable x, variable y, and then all of these are indexed in a map by their hash.
And the arguments of each operator will be the hashes of the operand nodes, as you can see in this example. All right, so this is the Haskell data type that we use to represent that. We have an Op data type with various constructors: a node can be a variable, a parameter, or a constant; it can be a sum of a list of nodes, a product, a scaling of one node by another, or a complex number constructed from two real-valued nodes. And a node consists of an operation, an element type (real or complex), and a shape. There's also a node ID, which is just the index of the node in the expression map. Okay, so the question is: how are we going to represent expressions and how do we combine them? The trivial way would be to use the map itself together with a node ID, as a tuple, to represent an expression, and use that as a base to build expression combinators, right? Well, let's see how we'd implement a function to add two expressions that way. The natural way would be: given two expressions, each with its own map, merge the two maps, create a new node out of the two existing root nodes, find a hash for that new node (rehashing if there's a collision), and finally insert the node into the merged map and produce a new tuple. It sounds simple enough, and it should work. But actually, not quite. The problem is that when we merge two maps, there can be hash collisions that we don't handle — what happens if two different expressions end up hashed to the same index?
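As a rough illustration of the data layout just described, here is a small self-contained sketch. The names (`Node`, `Op`, `insertNode`) and the toy hash function are simplified stand-ins for illustration, not HashedExpression's actual API: nodes live in a map keyed by hash, and inserting an already-present node returns the existing index, which is what makes common subexpressions shared.

```haskell
import qualified Data.IntMap.Strict as IM

-- Simplified stand-ins for the library's types (illustrative only).
type NodeID = Int
data ElementType = R | C deriving (Show, Eq)
type Shape = [Int]

data Op
  = Var String
  | Const Double
  | Sum [NodeID]        -- arguments are the node IDs (hashes) of the operands
  | Scale NodeID NodeID
  deriving (Show, Eq)

data Node = Node { nodeOp :: Op, nodeShape :: Shape, nodeElt :: ElementType }
  deriving (Show, Eq)

type ExpressionMap = IM.IntMap Node

-- A toy string hash (djb2) standing in for a real hash function.
hashNode :: Node -> Int
hashNode = foldl (\h ch -> 33 * h + fromEnum ch) 5381 . show

-- Insert with hash-consing: an identical node reuses its existing index;
-- a genuine collision (different node, same hash) is resolved by rehashing.
insertNode :: Node -> ExpressionMap -> (NodeID, ExpressionMap)
insertNode n m = go (hashNode n)
  where
    go h = case IM.lookup h m of
      Nothing             -> (h, IM.insert h n m)
      Just n' | n' == n   -> (h, m)       -- shared subexpression: same index
              | otherwise -> go (h + 1)   -- collision: try the next index
```

Inserting the same variable node twice yields the same `NodeID` and leaves the map unchanged the second time — that is the sharing the talk relies on.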
Remember that within a map there may already be rehashed entries — a node stored at an index that isn't its first-choice hash, because of an earlier collision. So what if two expressions from different maps are indexed by the same ID? If we simply merge the two maps, there will be hash collisions we don't deal with. And another problem is that it would be slow: every time you combine two expressions, the complexity is O(n + m), where n and m are the numbers of nodes in each map. So building an expression of n nodes from the ground up, from zero all the way to n, is an O(n²) algorithm — not as fast as we want. So the solution: we want to build expression combinators, but do not build combinators out of plain data — build combinators as computations. We want to use this type signature instead, because we want a notion of an underlying expression map: new subexpressions are built on top of the existing map, so we never have to merge maps and never hit the collision problem. If you're familiar with Haskell type signatures, you'll recognize this as the state monad. What we have here is a slightly modified version of the state monad — we call it the expression monad — that only allows modifying the underlying expression map by introducing new nodes; we don't allow arbitrary modification of the map. Combining two expressions then follows this pattern: we build the first operand, we build the second operand, we create the new node, and we introduce it. What's happening here is that we're combining two computations that build two expressions, which results in a computation that builds the combined expression.
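The "combinators as computations" idea can be sketched with an ordinary state monad over one shared map. Names like `intro` and `|+|` are assumptions for illustration, not the library's real API:

```haskell
import Control.Monad.State
import qualified Data.IntMap.Strict as IM

data Op = Var String | Add Int Int deriving (Show, Eq)

type ExpressionMap = IM.IntMap Op
type MonadExpr = State ExpressionMap   -- build steps share one underlying map

-- Introduce a node into the shared map with hash-consing: identical nodes
-- reuse their index, collisions are resolved by rehashing. One O(log n)
-- lookup/insert per node, so n nodes cost O(n log n) total.
intro :: Op -> MonadExpr Int
intro op = go (foldl (\h ch -> 33 * h + fromEnum ch) 5381 (show op))
  where
    go h = do
      m <- get
      case IM.lookup h m of
        Nothing              -> put (IM.insert h op m) >> return h
        Just op' | op' == op -> return h
                 | otherwise -> go (h + 1)

var :: String -> MonadExpr Int
var = intro . Var

-- Combining two expression *computations*: build each operand in the shared
-- map, then introduce the combined node -- no map merging anywhere.
(|+|) :: MonadExpr Int -> MonadExpr Int -> MonadExpr Int
ex |+| ey = do
  nx <- ex
  ny <- ey
  intro (Add nx ny)
```

Running the same build twice inside one `runState` returns the same node ID both times, since the second build finds every node already present in the map.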
As a result, hash collisions are handled, because nodes are introduced sequentially and there is no merge — we control the whole process of rehashing when a collision occurs. And this happens in O(log n), because we only need to look up a node in the map and insert into the map, each of which takes O(log n), so building an expression of n nodes from the ground up takes only O(n log n). That is how we capture expressions and build expression combinators; if we want infix operators, we just define an operator that uses this to combine expressions. Okay, so that's how we capture expressions. Now let's go back to the first goal: how do we make it type safe? We already have a type, but the problem is that the type doesn't carry enough information — we only have the expression builder, and we want more. We want to capture the shape of the expression, and its element type (real or complex), at the type level, not just at the term level, the value level. So we introduce a new type. We call it a typed expression, and we give it two phantom type parameters: one for the shape and one for the element type. So if we have something like a typed expression with shape parameters 15, 15, 15 and element type C, it means this is an expression of complex element type with shape 15 × 15 × 15. From that, we can build combinators over the typed expression. We have primitive constructors for variables; here's an example of how we create a new variable: we say variable2D with shape 20 × 15, and this will be a real variable with the corresponding shape in its type.
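A minimal sketch of the phantom-parameter idea, using GHC's `DataKinds`. `TypedExpr` and `variable2D` are illustrative names, and the `String` payload is a stand-in for the real (expression map, node ID) pair:

```haskell
{-# LANGUAGE DataKinds, KindSignatures #-}
import GHC.TypeLits (Nat)

-- Type-level tags for the element type.
data R
data C

-- Shape and element type live only at the type level; the runtime payload
-- (a stand-in here) would be the expression map plus a node ID.
newtype TypedExpr (shape :: [Nat]) et = TypedExpr String

variable2D :: String -> TypedExpr '[m, n] R
variable2D = TypedExpr

-- Addition requires identical shape *and* identical element type.
(|+|) :: TypedExpr d et -> TypedExpr d et -> TypedExpr d et
TypedExpr a |+| TypedExpr b = TypedExpr ("(" ++ a ++ " + " ++ b ++ ")")

image, mask :: TypedExpr '[20, 15] R
image = variable2D "image"
mask  = variable2D "mask"

combined :: TypedExpr '[20, 15] R
combined = image |+| mask
-- Adding a TypedExpr '[15, 20] R here would be rejected at compile time.
```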
From that, we can define all the operators on expressions, and because of the information at the type level, we can give a specification to each operation. For example, you can only add two expressions of the same shape and element type; or you can construct a complex expression of a given shape from a real part and an imaginary part, which are real expressions of that same shape. For example, here you can say a Fourier transform must take a complex expression, and it produces another complex expression of the same shape. And here's an example: if you try to add two expressions of different shapes, the compiler will yell at you — "hey, you can't do that: we expected this type and you provided an instance of that type." Another example is the scaling operator in a vector space, where we can use a type family to define a more complex constraint. For example, you can use a real scalar to scale both real and complex vectors, but a complex scalar can scale only complex vectors; if you try to scale a real vector by a complex scalar, you get a type error. We can express that in a type family, using type-level programming in Haskell. So for example, if you try to scale a real vector by a complex scalar, the type system simply won't allow it — you get a compile error. Another example is projection operators: this is similar to Python's slicing notation, only with type safety. So if you try to project with an index out of range, the type system will tell you. And with that, we've made expression construction type safe, and because of that we can catch errors early.
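The scaling rule described above can be sketched with a closed type family. This encoding is an assumption for illustration — the library's actual mechanism differs — and here the ill-typed combination simply has no reduction, which makes any concrete use of it fail to compile:

```haskell
{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies #-}
import GHC.TypeLits (Nat)

data R  -- real element tag
data C  -- complex element tag

-- A real scalar may scale real or complex vectors; a complex scalar may
-- scale only complex vectors. There is no equation for C scaling R.
type family ScaleOK scalar vector where
  ScaleOK R v = v
  ScaleOK C C = C

newtype Scalar et            = Scalar Double
newtype Vector (n :: Nat) et = Vector [Double]

scale :: Scalar s -> Vector n v -> Vector n (ScaleOK s v)
scale (Scalar c) (Vector xs) = Vector (map (c *) xs)

scaled :: Vector 3 R
scaled = scale (Scalar 2 :: Scalar R) (Vector [1, 2, 3])

-- Scaling a real vector by a complex scalar is rejected at compile time,
-- because ScaleOK C R never reduces; e.g. this does not typecheck:
--   bad :: Vector 3 R
--   bad = scale (Scalar 2 :: Scalar C) (Vector [1, 2, 3])
```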
We can ensure that when we have a model and we write it down, it must be well-formed — otherwise we'd get a compile error. Another advantage is that we can use typed holes: for example, with the VS Code Haskell plugin, we can just put in a hole and the editor will tell us what type of value we need to provide. That assists development in general. Okay, so let's go back to goal two. As I mentioned, we already have a way of hashing expressions, and therefore we identify common subexpressions and index them at the same nodes across expressions. What else can we do? Let's go to the next step: because we have a symbolic representation of expressions, we can perform rewriting and simplification. This means we can obtain equivalent but simpler expressions, and therefore reduce evaluation time — and beyond that, we can share more computation. For example, with the two expressions (x + y) + z and (x + z) + y, both get rewritten to the sum of x, y, and z; then, boom, they're indexed at the same node and the computation is shared. For rewriting and simplification, we developed a rewriting system: we added to our library a built-in domain-specific language (DSL) for rewriting, based on matching and replacing. We can write rules like this: if you take the real part of an expression that was constructed from a real part and an imaginary part, simply rewrite it to just the real part; or if you multiply by a complex number with zero real part and zero imaginary part, simply rewrite it to zero — no need to evaluate it. Under the hood, this also uses the expression monad we introduced earlier. And not only can we simplify expressions, we can also identify special patterns — for example, in neural networks.
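The two rules just mentioned can be sketched as plain pattern matches on a toy AST. The real DSL works match-and-replace style over the hash-consed map; this stand-alone version only shows the shape of the rules:

```haskell
-- Toy expression type for illustrating rewrite rules (not the library's).
data Expr
  = Var String
  | Zero
  | Mul Expr Expr
  | RealPart Expr
  | MkComplex Expr Expr   -- complex value from real and imaginary parts
  deriving (Show, Eq)

-- One bottom-up simplification pass; in practice such passes are iterated
-- to a fixed point.
simplify :: Expr -> Expr
simplify e = case descend e of
  RealPart (MkComplex re _)   -> re    -- Re (re +: im)  ~>  re
  Mul (MkComplex Zero Zero) _ -> Zero  -- (0 +: 0) * e   ~>  0
  Mul _ (MkComplex Zero Zero) -> Zero
  e'                          -> e'
  where
    descend (Mul a b)       = Mul (simplify a) (simplify b)
    descend (RealPart a)    = RealPart (simplify a)
    descend (MkComplex a b) = MkComplex (simplify a) (simplify b)
    descend x               = x
```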
We know that this is a sigmoid function, right? So we can identify this pattern and rewrite it to a special node called sigmoid. Then, when we generate code — which we'll talk about later — we can use a special instruction or a hardware accelerator to evaluate that sigmoid node, for example. What next? Another important piece of an algebraic modeling language is computing derivatives, because to solve optimization problems you need a way to evaluate derivatives. We use the reverse accumulation method — this is actually the common method in automatic differentiation — but instead of producing numerical values of derivatives, we produce symbolic derivative expressions. These expressions are indexed as nodes in the same expression map as the original function. And under the hood, how do we implement it? Again with the expression monad we talked about earlier. So here's an example: you have this expression, here is its expression graph, and using the reverse method — which you may be familiar with — we traverse the graph in reverse order and calculate all the partial derivatives with the chain rule. What happens in HashedExpression is that all these intermediate expressions are added to the original expression map, where they get simplified, hashed, and indexed. And what happens in the end? As we all know, computing derivatives with the chain rule often leaves us with multiplications by one — trivial computations like that are simplified away by the rewriting and simplification system.
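Here is a tiny sketch of reverse accumulation producing symbolic adjoints, on a plain AST with no hash-consing. In HashedExpression these adjoint expressions would instead be introduced into the shared map and simplified — the `1 *` factors this sketch produces are exactly the chain-rule residue the rewrite pass removes:

```haskell
import qualified Data.Map.Strict as M

data Expr = Var String | Const Double | Add Expr Expr | Mul Expr Expr
  deriving (Show, Eq)

-- Reverse-mode differentiation: walk the expression backwards, pushing each
-- node's adjoint (a symbolic expression, not a number) down to its operands
-- via the chain rule, and sum the adjoints arriving at each variable.
grad :: Expr -> M.Map String Expr
grad e0 = go e0 (Const 1) M.empty
  where
    go (Var x)   adj acc = M.insertWith Add x adj acc
    go (Const _) _   acc = acc
    go (Add a b) adj acc = go b adj (go a adj acc)             -- d(a+b) passes adj through
    go (Mul a b) adj acc = go b (Mul adj a) (go a (Mul adj b) acc)  -- product rule
```

For `x * y`, the adjoint of `x` comes out as `1 * y` and the adjoint of `y` as `1 * x`.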
And then the derivatives and the objective function are indexed in the same map, so there is computation shared between them. Here is an example of that: we have multiple nodes shared between the objective and the derivatives. This is the combined expression graph of the objective function and its derivatives for the MRI problem we mentioned earlier — I think it looks kind of nice. Another point I want to mention: when we compute symbolic derivatives by producing symbolic expressions, there is often a problem associated with that called expression swell, because each expression can give rise to many other expressions in its derivative. But that's not the case in HashedExpression, because we work on a single expression lookup table and we have hash consing: common expressions get indexed at the same node, and we avoid the expression-swell problem. So that was goal one and goal two; now for the third goal: generating fast code for optimization solvers. Once we have captured the computations of the problem — the objective, the constraints, the derivative of the objective, the derivative of the constraints, all sharing the same map and graph — we can then generate code to evaluate them: it can be a function to evaluate everything, or a function to evaluate just the objective, or a function to evaluate the objective and its gradient. And that code can be used by optimization solvers — think gradient descent, L-BFGS, or Ipopt. In this version, we implemented code generation targeting C and C++, using a simple memory-allocation scheme where we allocate memory for every node in the graph.
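A toy version of this code-generation scheme — one memory slot per node, one C assignment per node — might look like the following. This is illustrative only; the real generator walks the hash-consed DAG, so a shared node is emitted exactly once, whereas this tree version would re-emit duplicates:

```haskell
import Data.List (intercalate)

data Expr = Var String | Const Double | Add Expr Expr | Mul Expr Expr

-- Emit straight-line C: one slot (t0, t1, ...) per node, with assignments
-- in topological order so every operand is computed before it is used.
genC :: Expr -> String
genC e = intercalate "\n" (stmts ++ ["return t" ++ show root ++ ";"])
  where
    (root, _, stmts) = go e 0

    decl n rhs = "double t" ++ show n ++ " = " ++ rhs ++ ";"

    go (Var x)   n = (n, n + 1, [decl n x])
    go (Const c) n = (n, n + 1, [decl n (show c)])
    go (Add a b) n = binop "+" a b n
    go (Mul a b) n = binop "*" a b n

    binop op a b n =
      let (ia, n1, sa) = go a n
          (ib, n2, sb) = go b n1
      in  (n2, n2 + 1, sa ++ sb ++ [decl n2 ("t" ++ show ia ++ " " ++ op ++ " t" ++ show ib)])
```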
Then we simply evaluate everything in topological order. In this version we only use simple for-loops, and we use FFTW3 to evaluate Fourier transforms. The generated code looks like this: it has metadata for each variable — the offset where the objective lives, where to look each thing up — and there is a function to evaluate each target we want to evaluate, and we have that in a problem.c file. What happens next is we write an adapter that connects the C code with the optimization solver's code. We have implemented adapters for these solvers: gradient descent, L-BFGS, L-BFGS-B, and Ipopt. You take that adapter, compile it with the optimization solver and with the problem.c code, run it, and you get the final result. For future work: because we have captured the expressions, and we allow multidimensional variables, we have a lot of opportunities for parallelization. Each expression can often be treated as a grid of numbers, so for pointwise operations like addition or multiplication we can use embarrassingly parallel approaches — SIMD or GPU. Another parallelization opportunity: we can analyze the graph and generate code that spawns multiple threads, each evaluating an independent part of the graph, and then merges them together, for example here. To summarize: we have built a language that helps us build and solve optimization models. We have made it type safe with type-level programming. We have captured the models and are able to eliminate redundancies and simplify computations.
And we have implemented the C code generator to make the whole thing performant and pluggable into optimization solvers. The library is open source at this GitHub address. So, that concludes my introduction of HashedExpression. Thank you for listening. I guess it's question time.

Yes. Thanks a lot, Nhan and Dr. Christopher, for this wonderful session. What types of applications have we tested this for? You mentioned a number of applications so far.

So, this is a very generic framework: any application that you can model as an optimization problem, you can use HashedExpression for — it's not tailored to any specific application. For example, if you want to train a neural network, you can express it in HashedExpression, feed it into an optimization solver, and train it. If you check out the GitHub repository, we have several examples of how to use it, from simple logistic regression to neural networks to MRI reconstruction.

In another project, we're currently using this to do instruction scheduling. That's a pretty experimental application — if it works out, then you'll probably hear more about it. But it's good to have the type safety in that application, and this is where the embeddability is really important: we've got an existing compiler that does a lot of work to figure out what the dependencies and anti-dependencies are for instructions that need to get scheduled into a loop, or even into a basic block. There are other constraints as well. Most of the code is about figuring out what the optimization problem is; the optimization problem itself is relatively straightforward — it could be big, but it has a simple structure. And then we want to be able to read the result back and do something with it, right?
So it's not an example where you just want to know, say, the best portfolio, and then you get the numbers and buy your stocks or bonds or whatever — we have to do significant post-processing with the result. In the case of instruction scheduling, you may have an instruction schedule but then find you can't do register allocation; then you'd want to set up a new optimization problem and try to come up with a solution that you could register-allocate.

The other thing is that we had a previous library — the code was messier, and we didn't have the features that Haskell has now — but it did have physical units, and it also had exterior differential forms. This code is much more readable, the performance is much better, and it's got property-based testing and things the old code didn't, but we would like to put those other features back in.

There are no more questions here, so we can end this webinar here.