Nicolas Pouillard used to be a post-doc researcher at the same university that I'm going to, the IT University of Copenhagen, so I'm really, really thrilled about him being here. Nicolas worked on the DemTech project at ITU, a research project on developing electronic election technology. Before that, he was a PhD student in the Gallium research team at Inria Paris, oh, I'm going to butcher this, Rocquencourt, with a focus on designing programming languages and in particular meta-programming. Today, Nicolas will give us a presentation on the experimental language Ling, created in collaboration with researchers from the IT University of Copenhagen and Chalmers University. French was never my main language. Please help me welcome Nicolas Pouillard.

Thank you very much. I'm really happy to be here today to introduce this new programming language that I've been working on for the last few months, building on research done earlier. Programming language design is something dear to my heart; we really need better programming languages. Here I'm addressing the case where strongly typed functional programming is not enough: the case where this high-level, modular view lacks precise resource management. So let's try to combine the two.

As an introduction, let's talk about optimizations. What do we want from the optimizations of a program? They could be done manually by the programmer or automatically by the compiler. We want optimizations to improve the performance of the program. That is the first requirement, but it is also very difficult to guarantee, because systems are so complex today that you might remove some lines of code and still end up with a slower program; these effects do happen. But at the very least, optimizations should be safe: they should not turn a correct program into an incorrect one. We already have way too many incorrect programs everywhere; there is no need for compilers, or optimizations in general, to produce more of them. It's better if these optimizations are automatic, done by the compiler. Still, if they are done automatically, it should be predictable to the programmer when they apply and when they don't.

Let's take a general case of optimization. Say you have two parts in your program, F and G. F reads some input data and produces some intermediate data, and G reads back this intermediate data to produce some output data. This happens all the time in programs. In some cases, maybe many cases, the two programs F and G can be combined, melted together. We call that fusion. We say that we fuse F and G such that the result reads the input data and writes exactly the same output data, but without allocating any of the intermediate data. It's not always possible, but what this language does is guarantee that it happens when you think it does. If you think it should fuse and it doesn't, you get an error from the compiler. You can then change F to make it a better producer of the data (this will become clearer later on), or change G to make it a better consumer, or simply decide that this cannot be optimized away and that the data will be allocated.
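To make the producer/consumer picture concrete, here is a rough Haskell analogue of the situation (illustrative names only, not Ling code): F produces an intermediate list, G consumes it, and the fused version does both in a single pass without allocating the intermediate list.

    -- A Haskell sketch of the idea, not Ling.
    f :: [Int] -> [Int]          -- F: reads the input, produces intermediate data
    f = map (+ 1)

    g :: [Int] -> Int            -- G: reads the intermediate data, produces the output
    g = sum

    unfused :: [Int] -> Int      -- allocates the intermediate list
    unfused input = g (f input)

    fused :: [Int] -> Int        -- one pass over the input, no intermediate list
    fused = foldl (\acc x -> acc + (x + 1)) 0

In a language like Haskell, the compiler may or may not perform this rewrite for you; the point of Ling is to make it predictable.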
The critical areas where this kind of optimization is required, and actually done by hand all the time by programmers, are the following: computation; all the embedded systems, which have so few resources available; drivers and kernels; and, something also dear to my heart, security and cryptography, where we not only want programs to be fast but also need them to behave independently of the secret data, to avoid timing side channels and so on.

So what do we have in the language? We have terms, which I'm not going to say too much about. They can provide a form of meta-programming, which I also won't say much about, but you can think of something much better than CPP or templates in C++. They are also used for data, such as integers or doubles and operations on them. What we are going to talk a lot about are processes: they describe behavior, the actions we perform on data, that is, reading and writing data. Types are there to enforce a safe use of data by terms: we don't want to mix an int and a boolean, or an int and an array. Sessions are the types of processes, and they are there to enforce that we get the behavior we wanted: if we say that we should read an int, we should not be writing something else. We also have dependent types and dependent sessions, which I'm not going to discuss here because it's another big topic, but it's crucial that this was thought out in the language from the start, because it enables formal verification, and convenient formal verification at that. Last but not least, we have dual sessions, which enable fusion, the optimization I was mentioning, and we will see what duality means.

Okay, so where does this language come from? I took inspiration from many languages: purely functional, strongly typed languages such as Haskell; type theories such as Martin-Löf type theory and the Calculus of Constructions; and fully developed programming languages for proving, such as Coq, Agda and Idris. All the process part comes from the Pi calculus, which you may know better through the languages it has inspired to some extent. More precisely, all the research on session types and linear logic has been an inspiration, and there are two lines of work developed by me and colleagues at ITU in Copenhagen and at Chalmers in Sweden.

Okay, so this is bad. This is supposed to be a nice summation formula; it's not a big problem, there is only one slide like that. So let's pick an example: matrix multiplication. I'm going to attempt something. Okay, anyway, we have two matrices A and B that we want to multiply. This is really not good either; anyway, it's not much of a big deal. For each row of A and each column of B, we zip these two vectors, which are of the same size, to produce a single vector containing the pairwise products, and then we sum that vector up. That is the functional presentation, which we can't really see here. So we are building this vector of products and summing it right away: we are building it for no reason other than modularity; it's just a nice way to express the program. In the end, we should be able to write the program such that we appear to allocate this intermediate, imaginary vector, but then it disappears.
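As a rough functional sketch of that specification in Haskell (illustrative, not Ling code): entry (i, j) of the product is the sum of the pairwise products of row i of A and column j of B, with the intermediate vector of products built and then summed away.

    import Data.List (transpose)

    -- C[i][j] is the sum over k of A[i][k] * B[k][j].
    matMul :: [[Double]] -> [[Double]] -> [[Double]]
    matMul a b =
      [ [ sum (zipWith (*) row col)   -- intermediate vector of products, then summed
        | col <- transpose b ]
      | row <- a ]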
Okay, so here is the outline. We will see a first Ling program, then we will talk more about types and their role as an approximation tool. Then we will cover duality, allocation and fusion, and then come back to this matrix multiplication example.

Okay, so here is the first program. It's called double, and the proc keyword marks the beginning of a process. We declare two channels, A and B, and each channel has a session. A session can be this question mark Int or bang Int: question mark means read and bang means write. So we must read an Int on A, which we do with this let-arrow syntax, giving the name x to what we've read on A. The dot indicates sequencing, and then we do a write on B: we write x plus x. The typing of this language ensures that we do a single read and a single write on these channels.

Okay, now let's make a new process that feeds 21 as the input value. It's a new process with a single channel B, on which we should write an Int. We allocate a channel A, write 21 into A, and then call our double process defined previously with A and B. Notice the equals sign: it means this is a definition, so we can always replace double by its definition. We get the following program, where we just put the definition of double in its place. Here we see the allocation, then the write, then the read. Now this is a trivial case where fusion can apply: it replaces this allocation, write and read by a single local definition. And these local definitions really exist only at the static level, so they can be, and will be, expanded away, which gives us this final program that just writes 42 directly into B. So with this simple program, we've seen how to declare processes and their channels, allocation, read and write.

Okay, let's talk about types and type systems. If you believe that type systems are just there to reject programs, some kind of bad cop telling you that your perfectly fine program has to be rejected; or if you think that Java, C# or C++ are good examples of what type systems can be; or if you prefer dynamic typing and say that unit testing can successfully replace a type system; in all these cases, I would say that you're wrong, or at least that you should think again. I think a good type system should be a positive thing. It should explain to you why and where your program can or will fail. It should also empower the compiler to optimize; that is exactly what I'm describing with this fusion thing, and it is critical to have the right types for it to happen. Another example is that sometimes we can make a type so precise that it determines the program: the type is so precise that there is no need to write the program anymore. But here, let's remember this: a type is an approximation, and this approximation is there, in particular, to enable abstraction. Int is an approximation for five and seven, for instance. But an integer that is odd would be a more precise type, or an integer that is prime, or an integer that is equal to five would be an even more precise type. So we can have a wide range of approximations with these types. As I said, types are for data and sessions are for behavior, so we will see what this approximation means for behaviors.
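As a small Haskell illustration of this idea of types as approximations of varying precision (not Ling; the names here are made up, and the last point about parametricity is my own gloss on "the type determines the program"):

    -- Int approximates many values; a newtype with a checked constructor carves
    -- out a more precise approximation: the odd integers.
    newtype Odd = Odd Int            -- invariant: the wrapped Int is odd

    mkOdd :: Int -> Maybe Odd        -- the only way to build an Odd
    mkOdd n | odd n     = Just (Odd n)
            | otherwise = Nothing

    -- At the other extreme, a type can be so precise that it determines the
    -- program: up to parametricity, the identity is the only total function here.
    identity :: a -> a
    identity x = x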
Okay, let's take another example. We have a process divmod, which reads two inputs, A and B, and writes two outputs, D and M, where D is the division of A by B and M is the modulus of A by B. We are going to write this as a process, and we will see that processes can be combined in parallel and sequentially. Moreover, we are going to give a type, a session, to this process, and we'll see how this gives precise control over the interface. Finally, remember that all these channels must be used exactly once.

Okay, this slide is a bit scary: I've put six versions of the program and five versions of the type on it at the same time, so that we can go through some of them. The type is in green. The first program does the reads and writes all in sequence. This one, with this pipe, says that we do the two reads in parallel and then the two writes in parallel. Others are all in sequence but in a different order, or mix some sequencing with some parallelism; we can make many variations. And all of these are accepted with this same type, which says that we should read on A, read on B, write on D and write on M. Notice the braces: they tell you that you have the choice of the processing order, meaning that all of these programs are valid. Now you can be more precise about the kind of processes that you want to allow. For instance, with this square-bracket-colon notation you can enforce that the channels are used in order, from left to right; then only this program remains valid. This makes life much easier for whoever uses your process, as we'll see later with duality. You could also say you want all these channels to be used in parallel, used independently, except that it won't actually be useful in this situation, because we would have to write the division and the modulus without having access to the inputs. We can be slightly more fine-grained: using these brackets, which mean in parallel, together with the bracket-colon for in sequence, we can say that we want the reads done in parallel and then the writes done in parallel; then only this middle program is valid. We can have variations like that. What we should take away is at least these different choices: any order, from left to right, and all in parallel.

Okay, duality. Up to now, we've been thinking of these channels a bit as if they were memory, and that is exactly my intent. But just for a moment, let's think of them as a protocol. If I am writing an Int, then the other side should be ready to receive an Int: there is this duality. And this duality also works with these array, struct or tuple-like types. If one side has the choice of the processing order, then the other side has absolutely no choice. If we have a session S, its dual is written with a tilde. If we send an A, the dual is to receive an A. For braces, which mean any processing order, the dual is the most strict processing order, that is, completely in parallel. And for the left-to-right one, the dual is still left to right, but we have to dualize inside.
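A toy Haskell model of these sessions and of duality as just described (the constructors and names are made up for illustration; this is not Ling syntax):

    data Session
      = Send Ty              -- bang T: write a value of type T
      | Recv Ty              -- question mark T: read a value of type T
      | AnyOrder [Session]   -- braces: use the components in any order
      | Parallel [Session]   -- brackets: use the components independently, in parallel
      | InOrder  [Session]   -- bracket-colon: use the components from left to right

    type Ty = String         -- stand-in for data types such as Int or Double

    -- Writes become reads and vice versa; "any order" on one side forces
    -- "fully parallel" on the other; left-to-right stays left-to-right;
    -- and we dualize inside the compound sessions.
    dual :: Session -> Session
    dual (Send t)      = Recv t
    dual (Recv t)      = Send t
    dual (AnyOrder ss) = Parallel (map dual ss)
    dual (Parallel ss) = AnyOrder (map dual ss)
    dual (InOrder  ss) = InOrder  (map dual ss)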
Just a quick example to see how this goes. We have this pair of an Int and a pair of a Double and an Int, together with an annotation that tells us what we should do with it: we should write an Int, and we can do that now or later, because of the braces; but we at least have to receive the Double and then receive the Int, in that order. Now let's see what the other side has to do in front of us. We put the tilde and then work through the definition: the braces become brackets, the bangs become question marks, the in-order sequence stays in order, and the question marks become bangs. The result is that the opposite process must receive an Int independently from writing this Double and Int in this order. In particular, it cannot read the Int and then send it back on the other part: this would create a deadlock. This language, when used in this parallel and concurrent setting, is deadlock free.

Okay, allocation and fusion. Let's take the general setting where we have two processes, F and G, and they follow dual sessions: it could be any session S, and they happen to follow S and its dual. This new syntax with the square brackets, which reminds you that things must be used in parallel, forces C and D to be used in parallel and independently. This is a case where fusion is guaranteed to happen. So in all the settings where you manage to make the two parts of the program follow dual sessions, you are guaranteed that this allocation can be removed. Maybe you want the same thing but without the parallelism: there is another syntax for new, again with the same duality requirement, where you can put the processes in sequence. This is slightly more restrictive, because you really have to do everything that has to be done with C before D. At the same time, this enables real allocation, the allocation that cannot be fused.

Okay, let's take an example. We have two processes, F and G, and they both use braces. Here I'm introducing a new notation, this caret 100, to show that we don't only have tuples but arbitrary arrays. This process is going to write 100 integers in some order that we do not know; maybe it goes from left to right, let's say. The other process wants to read all of these 100 Ints, but in a potentially different order. If we don't change F and G, then we have no other choice than to allocate some intermediate data. Then we run F, which writes all the locations in the order it wants, and G reads all the integers in the order it wants. This is a case where, if you don't put the alloc flag here, the new will complain, the compiler will complain and say: I cannot fuse that. So you have to acknowledge the fact that this is really going to be allocated. If you don't want that, you can change F, make it work a bit more and write these 100 Ints in parallel; if you can do that, then the whole thing fuses. Or you might change G and say: actually, I could read them in parallel; fine, in that case it will fuse as well. Or other changes to F and G.

Okay, so we have allocation, and we can go back to matrix multiplication. I'm not going to get into the details of these operations, but what we should take away is that we can derive the zip function from a much more general function that works not only for doubles and not only for multiplication, but for any three types and any operation F. In the very sugary version of this language, we can write it as short as this. Okay. The summation, likewise, could be written just for doubles and addition.
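As a rough Haskell analogue of this zip-then-sum pipeline, with, for contrast, a manually fused version in the spirit of the one shown later in the talk (illustrative names, not Ling code):

    -- General combinator, then specialised to doubles and multiplication.
    zipWithF :: (a -> b -> c) -> [a] -> [b] -> [c]
    zipWithF f (x:xs) (y:ys) = f x y : zipWithF f xs ys
    zipWithF _ _      _      = []

    -- Unfused: builds the intermediate vector of products, then sums it.
    dotProduct :: [Double] -> [Double] -> Double
    dotProduct xs ys = sum (zipWithF (*) xs ys)

    -- Manually fused: accumulates directly, no intermediate vector.
    dotProductFused :: [Double] -> [Double] -> Double
    dotProductFused = go 0
      where
        go acc (x:xs) (y:ys) = go (acc + x * y) xs ys
        go acc _      _      = acc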
But in functional programming, we tend to prefer writing general combinators, such as this foldl, once and for all. Again, I'm skipping the details, but we can see that it uses a temporary variable to accumulate the results of F, which then becomes the addition on doubles. Now we can put together this zip function and the sum function, and we have this intermediate data, the vector in the middle that I was mentioning, which the theory tells us we can fuse away. A fused version, here manually fused, is this, where we see that it mixes the summation, with its zero and plus, and the multiplication in a single function.

So, some kind of conclusion. We get to combine precision and modularity, and we get a cost-free abstraction. This particular optimization is quite general: it subsumes various kinds of inlining and other fusion of intermediate data, and it is predictable, safe and automatic. The language and the tool chain that goes with it are completely open source, and I hope you will have a look, try it, and help contribute. Feel free to ask questions, report bugs and so on. Thank you.

Okay, anybody with questions for Nicolas, please go to one of the four microphones on the sides. Okay, we're starting with the microphone over there.

Yes, is it possible to use this language, for example, to synthesize logic, or asynchronous, maybe clockless, logic on a chip?

I'm not sure I get the question. To use it for what?

Synthesizing logic, to actually do your computation in hardware.

To synthesize? Logic. Logic circuits. Yeah, I mean, this language and the tool chain with it are really new, right? So there are still plenty of bugs, plenty of not fully implemented features, but the theory can clearly do this kind of thing. We can have a very high-level way of writing things like that. But I think the main domain it applies to is situations where performance is really critical. So I don't know, maybe this is such a case, but otherwise maybe Haskell is good enough.

Okay, one question over here, please.

So I was wondering about the different annotations you can put in the function definition, where you say, for example, in which order you want to read, or whether you want to read in parallel or serially. I was wondering, couldn't most of this be deduced by the compiler anyway, whether you can do that or not? Because you have specified the data flow and it's a pure function. So I was wondering if it's not redundant to have six different functions for different ways of accessing the data.

Yeah, so as a first step in type systems, it's always easier to check that the types are right than to try to guess, or what we call infer, the right types. It's not always what you want, but it's true that we could infer quite a lot of types. Actually, the way the type checker works on processes is bottom-up: it starts from the leaves of your program, grows the type upwards, and then checks that it matches your definition. But it is actually quite important to have a way of not giving the most precise type, to have some leeway when defining a type, something slightly more approximate: the type says the channels can be used in any order, and then the implementation says, yes, but I'm going to go left to right for now.

Okay, so can I have a process that produces one integer at a time and have them consumed by a process that just takes one integer at a time and reads them all serially, or do I have to define several versions of my consumer?
So maybe. One thing we cannot really do right now is to have a process that is parametric in the kind of processing order we have. And at the same time, we cannot really write the code in a parametric way either, so it would not necessarily work.

Okay, I'm sorry, no dialogues please; save your other questions. One question on the other side, over here.

Thanks for the talk, very interesting stuff. I was wondering, as I listened to this, the idea of a data flow language popped into my head. So has something like massive parallelization of data processing been part of your vision?

Yeah, so functional programming is already data flow, in many ways. The thing is that functional programming, while it is a success in many ways, typically fails at being precise enough about when and how much allocation is made. And this is what I'm trying to fix here, such that, by default, maybe you will use fusion most of the time everywhere and it just gets out of the way, just like GHC would optimize your functions; but in the cases where it fails, instead of just getting slow for no reason, you at least see precisely where the issue appears in your program. And so you can decide whether to allocate or to change your program, so you know where these issues are. There is still work to do to actually make that work eventually.

Okay, one question over here, on this microphone, in the blue jacket.

Okay, does the Ling compiler produce native executables?

Yeah, so it's quite early, but it's producing C code, for instance; the C backend is the only one I have. So there is this Ling language, there are some transformations that stay within the same language, and the final step so far is the C backend. It's not complete, it's not nice, it's not optimal, but it's producing precise code that is quite portable.

So it's not inconceivable to run bare-metal code written in Ling?

Yeah. Something I've not mentioned is that I looked at the code that was on the rad1o badge and found one instance of matrix multiplication. So I extracted the matrix multiplication from Ling into C and put it on the device, and we can actually change maybe to the camera to see the wobble animation. Can we change the camera?

Okay, there is one more question on the side, please.

So you're saying that Ling has dependent types. Does this extend to linear types as well? Can you quantify over a process, for example?

Yeah, so let's make the distinction between what can be done with the type system, that is, what can I check, and what can be translated into C or runnable code. What can be checked: we indeed have normal dependent types, but we also have dependent sessions. That means I can say: I'm receiving an int x, and if this x is below 10, then I'm supposed to receive a Boolean, otherwise I'm supposed to send something else. So the protocol can depend on actual data. We can also, again at the checking level, send and receive processes, sessions, types, functions, and pretty much anything that can have a type.

Okay, we have time for one more question from the internet. Is there a question from the internet? The internet is silent. That's a rare thing. Then we go with you; you're the last one tonight.

Thank you. When you thought about implementing this language, did you consider an implementation as a library or an extension to GHC instead? Or is it possible to do this without dependent types?

It is a really good question. So the dependent types are an addition; we could have this language without the dependent types.
It would already be interesting and would be easier to integrate into Haskell, for instance. The main issue is that there is linearity and there is this handling of sessions, and that is really difficult to implement as an embedded DSL. It could be a deeply embedded DSL, with quasi-quotations for instance, in Haskell. So this is, I guess, a future project. To me, it seemed simpler to just write a new language, but I agree that a potentially bigger project would be to integrate it into Haskell, Agda or Idris. Thank you.

Okay. Thank you. Please, a warm round of applause for Nicolas Pouillard.