I think I'm going to steal "proof of optimality" for this talk from Felix's talk, because I really loved that. But today we're not going to talk about DeFi; we're going to talk about multidimensional fee markets, or, to use the fancy term, how to do dynamic pricing for non-fungible resources. This is joint work with Alex Evans, Tarun Chitra, and Guillermo Angeris. The first thing I hope to convince you of is that fee markets with a joint unit of account, like gas, are actually pretty inefficient. And what we're going to slowly work towards in this talk is a framework to optimally set multidimensional fees. So the first part of this is: why are transactions so expensive? Why is having one market not necessarily something that you want? First, a little bit of an aside: one of the things that we actually do sometimes see with one-dimensional fee markets is these types of denial-of-service attacks. This is because all opcodes have fixed relative prices to each other, and whenever you have a mismatch between a relative price and the resources that an opcode consumes, you can get something that takes down a network. These have been termed resource exhaustion attacks in the literature. There was a famous one back in 2016 that essentially made the Ethereum network unusable; it did not take the network down, but it made it very hard to use for quite some time. This was essentially due to an opcode mispricing, and of course it was patched in a subsequent EIP. If we had a multidimensional rather than a single-dimensional market, we might have been able to adjust prices such that there was no need to reprice the opcodes after the fact. However, what we're going to concentrate on today is throughput: why having a single-dimensional market is actually bad from a network designer's perspective.
This is a very stylized example that is not at all close to practice, but I hope it illustrates the idea. Let's assume we have a bunch of users submitting transactions: some only consume CPU, some only consume bandwidth. The CPU ones have a utility of four, which is how much utility they give to the user that submits them, and the bandwidth ones have a utility of two. Let's imagine we have a block. Each of these transactions costs one gas, the gas price is three, and the block can fit four CPU transactions and four bandwidth transactions. Well, in a single-dimensional market, what happens? We fill the block up with CPU transactions, but we actually have a lot of block space that goes unused, because the bandwidth transactions aren't high enough utility. Something like this can happen, say, with an NFT mint. However, if we have a 2D market where CPU has a price of three but bandwidth has a price of one, we would actually end up including all the CPU transactions in this block and all the bandwidth transactions as well. Like I said, this is a very stylized example, but it does illustrate that when you price things separately, or in other words, when resources are orthogonal, they should be priced separately. However, we need a mechanism for price discovery to do this. How do we decide that CPU is three and bandwidth is one? What do those prices even mean? How do we get to those prices? I'm going to do a little bit of an aside here, in that I've been throwing around this term "resource" a lot, and there's a question of what exactly I mean by that. The working definition we're going to use for this talk is anything that can be metered. So a resource is anything for which I can say how much of it a transaction uses. One example right now is rollup data. But we could also talk about big resources like compute, memory, and storage.
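As a quick sanity check, the stylized example can be reproduced in a few lines. This is only a sketch, not the paper's model: the numbers (utilities 4 and 2, prices 3 and 1, four transactions of each kind) come from the slide, and the inclusion rule, include a transaction when its utility covers its fee, is a deliberate simplification of block building.

```python
# Stylized example: 4 CPU-only txs (utility 4) and 4 bandwidth-only txs
# (utility 2); the block has room for 4 of each.
cpu_txs = [{"utility": 4, "cpu": 1, "bw": 0}] * 4
bw_txs = [{"utility": 2, "cpu": 0, "bw": 1}] * 4
txs = cpu_txs + bw_txs

def included(txs, fee):
    # Include a transaction whenever its utility covers its fee.
    return [t for t in txs if t["utility"] >= fee(t)]

# Single-dimensional market: every unit of gas costs 3, regardless of resource.
one_dim = included(txs, lambda t: 3 * (t["cpu"] + t["bw"]))

# Two-dimensional market: CPU priced at 3, bandwidth at 1.
two_dim = included(txs, lambda t: 3 * t["cpu"] + 1 * t["bw"])

print(len(one_dim))  # 4: only the CPU transactions clear the uniform price
print(len(two_dim))  # 8: both resource lanes fill up
```

The point is just that under a single price, the bandwidth transactions (utility 2 versus price 3) never clear, so half the block capacity sits idle.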
We could go down to the opcode level and think of each individual opcode as a resource. We could also say sequences of opcodes are resources. For instance, calling a hot storage slot is going to be cheaper than calling a cold one, so maybe several SSTORE opcodes all in a row are actually a different resource than calling them one by one. Furthermore, if we're running full nodes on multicore machines, maybe compute on core one is a different resource than compute on core two, and so on. You can imagine this is a very general construction; resources can be very dependent on each other. As long as they can be metered, meaning we can say how much of a resource a transaction uses, they fit into our framework. To formalize this, and this is where we get a little bit into the math, we're going to say a transaction j consumes some vector of resources. There are m resources, and this vector is a_j: the i-th element of that vector is the amount of resource i consumed by transaction j. Now that we're starting to build blocks, we're going to denote by x a 0-1 vector: we have n transactions, and x_j is going to be 1 if transaction j is included in the block and 0 otherwise. This allows us to very easily write the quantity of resources consumed by a given block, which we denote y. All this is is summing up the resource vector of each transaction times x_j: if x_j is 0, that transaction is not counted in the sum, and if it's 1, it is. This can be written in very convenient matrix-vector notation as well, where the columns of A are the resource vectors of each transaction.
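In code, with made-up numbers for the resource vectors, the block utilization y = Ax is just a matrix-vector product:

```python
import numpy as np

# m = 2 resources, n = 3 transactions. Column j of A is a_j, the resource
# vector of transaction j (the values here are illustrative, not from data).
A = np.array([[2.0, 0.0, 1.0],   # resource 1, say compute
              [0.0, 3.0, 1.0]])  # resource 2, say bandwidth

x = np.array([1, 0, 1])  # include transactions 0 and 2, skip 1

y = A @ x  # total resources consumed by this block
print(y)   # [3. 1.]
```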
Now that we have a notion of a resource, we can talk about constraining resources, targets for resources, and charging for each individual resource. We're going to first define a resource target, b*. The deviation from the target, in the notation I introduced earlier, is just Ax minus b*; remember, Ax is the resource utilization of a particular block. In Ethereum this is one-dimensional, so the deviation is a scalar, and b* is just 15 million gas. We also sometimes want a resource limit that says after a certain point a block is invalid, so blocks have to satisfy Ax less than or equal to b. In Ethereum, this is 30 million gas. Again, in Ethereum this is all one-dimensional; we're extending it to the multidimensional case. Finally, this allows us to talk about prices for each resource. We're going to have some vector p, an m-vector, where p_i is the price of resource i. That allows us to very easily write how much a transaction costs, which is just the dot product of its resource vector and p, and this splits up into the sum here. One thing to note: when I talk about prices, this is the amount burned by the network, or essentially the price that the network charges for a given resource. So, like EIP-1559, it's not the price that users pay, say, validators for inclusion in the block; nothing about tips here. This is purely the amount that's burned. All right, so we've set up all the math, which is great, but we still have to go back and say, well, how do we actually determine what to charge for each resource? There are a number of very reasonable properties that we want. If the utilization we have is equal to our target utilization, we probably don't want the price to update.
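Putting those definitions together, here is a small sketch with two resources. All the numbers are hypothetical (a two-resource analogue of Ethereum's 15M target and 30M limit), just to show how the target, the limit, and the per-transaction fee are computed:

```python
import numpy as np

b_star = np.array([15.0, 10.0])  # per-resource targets (made-up numbers)
b = np.array([30.0, 20.0])       # hard per-resource limits (made-up)

A = np.array([[4.0, 2.0],        # columns are resource vectors a_j
              [1.0, 5.0]])
x = np.array([1, 1])             # include both transactions
y = A @ x                        # block utilization: [6., 6.]

valid = bool(np.all(y <= b))     # block validity: Ax <= b, elementwise
deviation = y - b_star           # signed deviation from the target

p = np.array([3.0, 1.0])         # price vector: p_i is the price of resource i
fees = A.T @ p                   # fee burned by tx j is the dot product a_j . p
print(valid, deviation, fees)    # True [-9. -4.] [13. 11.]
```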
That seems like a good price. However, if we're over the target utilization, we would want the price of that resource to increase, because we want to make it more expensive so people decrease their usage. And if we're under, vice versa. A number of things have been proposed to this end. One proposal from the Ethereum Research forums back in January was this price update rule here. You can go through it and see that it does satisfy the properties we want. However, I could write down a bunch of other price update rules that also satisfy these properties. So this raises the question: is this a good update rule? What is this update rule actually doing? Are there other update rules that are better or have different behavior? How do we go about analyzing this? The punchline of this talk is that all these update rules are actually implicitly solving an optimization problem, and a specific choice of the objective of that optimization problem, which you can think of as how the network designer wants the network to perform, is going to give you a price update rule. So what I want to convey is that a good way to think about price update rules is not "how do I design the best price update rule?" but "what do I actually want the network behavior to be?" In other words, what is my objective? From there, we'll show how to get to the price update rule. This brings us to what we call the resource allocation problem. The setting for now is that we're going to pretend the network designer is omniscient and gets to choose all the transactions in each block. I know this is entirely unrealistic, or not even unrealistic, it's just absolutely false. However, it's going to allow us to build up a very useful mathematical problem that gets us to the price update rule. There are a few things that we need for this problem.
First, we want a loss function, defined by the network designer, that captures unhappiness with the current resource utilization. There are a few reasonable, or potentially silly, loss functions we could choose. One is this: the loss of y (remember, y is the resource utilization of a given block) is 0 if we're exactly at our target, and infinity otherwise. Another thing we could do is say we don't care if we're under the target, we only care if we're over it: the loss is 0 if we're under the target, and infinity otherwise. Again, these might not be what you actually want in practice. Potentially you want something where, if you're a little bit off the target, you're not that unhappy, and the loss grows, say, quadratically as your deviation increases. But the whole point is that we only need something that tells us the unhappiness with the current resource utilization. Then we need some way to say what the set of allowable transactions is, and we're going to encode all constraints in the set S. This is a binary set that encodes things like network constraints: as we said earlier, a block in Ethereum is invalid if it's over 30 million gas. However, there are a lot of complex interactions among transactions as well. For instance, if a lot of searchers are all trying to get a specific liquidation, only one of them can get it, and this can also be encoded in this set. So this is a very general object that just says which sets of transactions are OK. Now we play the first mathematical trick, and this isn't that important, but essentially, instead of considering S, we consider what's called the convex hull of S. This just means that instead of forcing x to be 0 or 1, we allow x to take fractional values between 0 and 1.
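The three loss functions mentioned can be written down directly. This is a sketch with a made-up two-resource target; the quadratic variant is one plausible smooth choice, not necessarily the paper's:

```python
import numpy as np

b_star = np.array([15.0, 10.0])  # hypothetical per-resource target

def loss_exact(y):
    # Infinitely unhappy unless utilization hits the target exactly.
    return 0.0 if np.allclose(y, b_star) else float("inf")

def loss_one_sided(y):
    # Only unhappy when some resource is over its target.
    return 0.0 if np.all(y <= b_star) else float("inf")

def loss_quadratic(y):
    # Mild penalty that grows quadratically with the deviation.
    return 0.5 * float(np.sum((y - b_star) ** 2))

print(loss_exact(b_star))            # 0.0
print(loss_one_sided(b_star - 1.0))  # 0.0: under target is fine here
print(loss_quadratic(b_star + 2.0))  # 4.0: 0.5 * (2^2 + 2^2)
```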
The way to think about this, and the way it makes sense, is that from the network designer's perspective, you care more about the average usage of the network, not one particular block. So if x_j is a fraction, that would say we include that transaction once every roughly 1/x_j blocks. We'll see that we can actually remove this relaxation in a little bit; it's just to set up the mathematical formalism. So this won't really matter soon, but I want to be complete. The final thing we need is to know how much utility a given transaction gives to the joint user and validator set. We group these two parties together into what we call the transaction producers. The reason we do this is that we don't want to deal with the game-theoretic analysis of looking at bids, auctions, and that type of thing. So we assume this group of people is together, they're submitting transactions, and those transactions have a specific utility, q. You'll see that it doesn't actually matter to our mechanism that we group these parties into the transaction producers, but this does present an area for future work. And I'd like to point out that we almost never know q in practice; it's more or less impossible to know. However, we will see that this doesn't matter once we write out the problem. OK, so that's a lot of setup, I'm sorry, but this is where we get to. What is the resource allocation problem? It is to maximize the utility of the included transactions minus the loss incurred by the network, subject to the resource utilization being defined by the included transactions and the transactions being allowable. This is the ideal, best-case scenario of what we would actually like to solve. For all the reasons I mentioned earlier, this is not something we can actually solve in practice, but it turns out to be a very useful starting point.
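For concreteness, the resource allocation problem just described can be written out in the notation built up earlier in the talk (q is the vector of transaction utilities, ℓ the network designer's loss, S the set of allowable blocks); the paper's exact formulation may differ in details:

```latex
\begin{array}{ll}
\text{maximize}   & q^T x - \ell(y) \\
\text{subject to} & y = Ax \\
                  & x \in \mathbf{conv}(S)
\end{array}
```

Here the variable is x (which transactions to include, fractionally), y = Ax is the induced resource utilization, and conv(S) is the convex hull relaxation mentioned above.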
Again, this is because the network designer doesn't get to choose which transactions are included, q is unknowable, you can't partially include transactions, all of these issues. However, we can pull from a branch of math called convex analysis, and specifically duality theory, to take this problem and turn it into a way to set prices so that the validators and users, the transaction producers, implicitly solve that optimization problem without the network designer needing to do anything except update prices in a very simple way. The thirty-second version of duality theory is that it allows us to relax constraints into penalties. I can say you don't actually have to satisfy this constraint; you just have to pay for every unit of violation. This allows us to take y, which is what the network designer cares about, the throughput, and decouple it from the transactions that are actually included in the block; there's just a penalty for these two things not matching exactly. What strong duality tells us is that if we correctly set the penalty, the penalty being the prices, then the dual problem is going to be equivalent to the original, the original being this problem, which is what we actually want to solve. The two utilizations are going to be equal, and the two problems have the same optimal value. So again, this tells us that if we correctly set the prices, we solve this problem without having to know q, without caring about the fractional transactions, without all the issues I mentioned. Turning the crank of the math a little bit, you can decompose this dual problem, which is to minimize this thing, into a network problem and a block building problem. And p is the dual variable that connects these two problems together; those, again, are the prices.
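Sketching that decomposition in the same notation: relaxing the constraint y = Ax with prices p as the dual variable gives a dual function that splits into exactly the two terms the talk names. The sign conventions here are one standard choice and may differ from the paper's:

```latex
g(p) \;=\;
\underbrace{\sup_{y}\,\bigl(p^T y - \ell(y)\bigr)}_{\text{network problem:}\;\ell^*(p)}
\;+\;
\underbrace{\sup_{x \in \mathbf{conv}(S)}\,\bigl(q^T x - p^T A x\bigr)}_{\text{block building problem}}
```

The dual problem is then to minimize g(p) over the prices p: the first term is the conjugate that the network can evaluate cheaply, and the second is the net-utility maximization the block builders already solve.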
The prices are essentially a penalty that we pay per unit violation of this constraint. This first term here is actually easy to evaluate; you'll probably just have to trust me on this. It's an object called the Fenchel conjugate, something we have in closed form, which means this can be run on chain. However, the second term is a little more interesting, so let's look at it in more detail. What is the second term actually saying? It's saying: maximize the utility minus the cost, so the net utility, subject to the transactions that we can actually include. This actually has the same optimal value if we just use S instead of the convex hull of S. And this is exactly the problem that is solved by the block producers. So what does this mean? The network never has to solve this problem. It can just observe which transactions were actually included in the previous block, and it gets the solution for free from the decentralized block builders. So what do we get at optimality? Well, assume the prices are set correctly, that's p*, and the block builders use those prices to include the transactions that are optimal. Then we get that the resource utilization of the network is exactly equal to that of the block. Back to what I was saying earlier: this constraint does hold at optimality. And y satisfies this condition, which, looking in a little more detail, means the prices that minimize g, so this is the dual problem, charge the transaction producers exactly the marginal cost faced by the network. If you set the prices optimally, for whatever loss function you define, the marginal cost of using more of a resource is exactly the price you charge for it.
Furthermore, these prices are the ones that incentivize the transaction producers to include transactions that maximize the welfare generated minus the loss incurred by the network. That's back to the original optimization problem we saw: correctly set prices and you solve that problem, and the network designer doesn't need to know the utilities or anything like that. All right, that's great, but I still haven't told you how to choose prices; I've just talked around it for a while. So how do you actually do this? Well, we can compute the gradient exactly. And what is this? Essentially, the network can determine this y*, and I said earlier that this is computationally easy. Then this x* at the current price is found by observing the transactions in the previous block. So all we do is apply our favorite optimization method, like gradient descent, and update the prices using this gradient up here. There are a lot of other optimization methods you could choose here; they're going to have different convergence behavior and different trade-offs between, say, convergence and complexity. That's all stuff we leave for future work. To go through simple examples of what I showed earlier: if you use that first loss function, which looks kind of silly, you actually get a somewhat reasonable update. It looks like the residual: you just update your prices by some fraction of the residual. If you use the one where you're only unhappy when you're over the target utilization, you get the same update, except you make the prices non-negative: if any of them are negative, you zero them out. And you can see that this makes sense. With the first loss function, we're unhappy if we're under the target utilization, so we might actually want negative prices to incentivize people to use more of a particular resource.
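One step of that gradient-based update can be made concrete. The numbers below are made up; the structure follows the talk: the gradient of the dual is y*(p) minus Ax*(p), and for the exact-target loss y*(p) is just b*, which gives the residual update:

```python
import numpy as np

b_star = np.array([15.0, 10.0])      # made-up per-resource targets
A = np.array([[10.0, 2.0, 6.0],      # made-up resource vectors (columns)
              [1.0, 8.0, 4.0]])
x_star = np.array([1, 1, 0])         # inclusions observed in the previous block

eta = 0.1                            # step size
p = np.array([1.0, 1.0])             # current prices

y_prev = A @ x_star                  # realized utilization: [12., 9.]

# Gradient step p <- p - eta * (y_star(p) - A x_star). With the exact-target
# loss, y_star(p) = b_star, so this is the residual update from the talk:
# over target -> price rises, under target -> price falls.
p_next = p + eta * (y_prev - b_star)
print(p_next)                        # roughly [0.7, 0.9]

# One-sided loss (only unhappy when over target): clip prices at zero.
p_clipped = np.maximum(p_next, 0.0)
```

Here both resources came in under target (12 versus 15, 9 versus 10), so both prices drop; with the one-sided loss the clip prevents them from ever going negative.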
With the second, though, we don't care if we're under our target, so we never have negative prices. Again, this gets to the point that the network designer chooses the loss function, and the loss function encodes exactly what your unhappiness is with a particular resource utilization. Once you do that, update rules that minimize that loss function fall out of it. So it comes down to choosing the loss function instead of choosing the update rule. That's a lot of math, so we did some numerical work to see how this would play out in very simple examples. Here we have the steady-state behavior of a network with only one type of transaction being submitted. You can think of this as pretty analogous to the example I showed at the beginning of the talk. You can see that one-dimensional prices are doing about 10 transactions per block, and multidimensional prices are able to eke out maybe two to three more transactions per block, even when it's the same type of transaction going through. So even in this simplest case, you get some improvement from using multidimensional prices. I'm not going to go through all the details of how we set this up, but I would encourage you to look at the paper for that. However, where this really shines is when we have a distribution shift. In this example, we have the type-one transactions, which you saw earlier, but we add a type-two transaction with a much different resource profile; you can think of this like an NFT mint or something. They come in at about block 10. The multidimensional prices are the purple and blue, and you can see there is this nice spike here where we do fewer type-one transactions and more of type two, and then once we go through all of these, we return to zero.
And we clear some backlog after we've gone through these. You can also see on the right here the multidimensional prices: once we hit block 10 and the distribution shifts a lot, this light blue price goes down, the other price goes up quite a bit, and then they return to steady state a little bit afterwards. The uniform prices are still able to adjust to a certain extent, but the overall throughput is lower. And back to what I was talking about earlier, where you have some target utilization that you want to hit: in this second example, the dashed lines are the targets. The multidimensional prices, on top, deviate from the target for a short period of time to handle the big spike in transactions, but then the network returns to steady state afterwards. With the uniform prices, where you can think of the controller as having fewer dimensions, you get this oscillating behavior; you eventually return to somewhat of a steady state, but you get a lot of a mess right here as the distribution shifts. Cool. So there's a lot of future work to be done here. One thing we didn't do is super extensive numerical examples, and you can imagine that using real data here might lead to valuable insights that let you tweak the framework in specific ways. In addition, like I mentioned earlier, we group the users and validators into that transaction producer set, and there is some work to do on the dynamical behavior: how do we make this strategy-proof, and, since our prices only cover the amount burned by the network, how do we make this into an entire system? Also, as I mentioned earlier, while I chose gradient descent here for the update rule, there are a lot of other things you could do.
You could actually choose the update rule in a way that gets you something very similar to what was proposed on the Ethereum Research forums back in January, and there's a question of which update rules are good, which are the most useful, and how you trade off between, say, convergence behavior and complexity: how quickly your prices can adjust versus how much work you're doing on chain. Then, of course, on the system designer side, if you're actually trying to use this in practice, there are a lot of questions that this general framework doesn't totally answer. For example, what should the actual resources be in a given system? How do you trade off the complexity of pricing every opcode, every sequence of opcodes, and so on, against the ease of use of these things? And how do you determine a loss function for the desired performance characteristics? The very important point here is that system designers should be thinking about these questions, and not necessarily about the exact update rule for prices, because in this framework, if you think about these questions, the update rule falls out quite naturally. All right, I encourage you to check out the paper, which has a lot more and is something like 38 pages going through this entire thing in excruciating detail, and I'm happy to take any questions. Thank you. One, two, yeah. Seeing as in your model you're willing to give different costs to different opcodes and resources and everything, I think something that could be interesting to see is the ordering of transactions itself, whether that could carry different costs, especially for storage.
For example, given that MEV is a thing, generally speaking, if you hit a storage slot earlier, it has much more potential value, because the MEV gets settled first. What would be the impact of pricing storage opcodes differently depending on where you are in the block? And generally speaking, in terms of resource utilization too, anything that happens earlier in the chain is more costly for the network as a whole, because you need to store it for longer. And I don't think your framework is straight-up compatible with this kind of costing, because it's missing one dimension on the cost vector. Maybe I'm wrong. Yeah, that's actually a great question. I think it's compatible with the first but not necessarily the second; or, the second one you could probably put into it, but it would be a little bit harder, this multi-block one. However, for a single block, you could actually view that set S as saying this transaction goes before this one, or this transaction goes after this one. And then perhaps if you're the second transaction in the block, you're actually using a different resource: you're using the second read of the slot. That would be extremely beneficial, I feel, from a network perspective, if you can convince more people to go in this direction, because this means lower cost for everyone and better utilization for everyone. So it's a very good direction; I'm really happy to see it. I'm not saying it's easy to implement, though. Oh no, it's very difficult to implement, and even then the implementation may not be possible as-is. Thank you. So do you see this research being applied directly, for example in Ethereum, in, I don't know, 50 years, or maybe sooner? I think there are probably some people in this room that could answer that better than I can. But we've gotten a lot of interest from different protocols, rollups, et cetera, that are interested in this, and from the Ethereum research team as well.
I can't speak to development timelines, though, and when this stuff would land. Like I said, there's definitely quite a bit of future work that has to go into making this production-ready. And I imagine that newer chains, which don't have to pressure-test their changes quite as much, will probably adopt something like this before Ethereum would. Hey, Theo. So, in the interest of keeping this a convex optimization problem, are there any limitations this puts on how we can construct different parts of the problem, such as the loss landscape? Or have I missed something, and is there a chance we can land in a local optimum rather than a global optimum here? Also a good question. So the loss function has to be convex; that's one immediate restriction. You can imagine that maybe there are two states you want to run in, and sometimes you want to hit one target, sometimes another; that wouldn't be convex, if your landscape looks something like that. So there definitely are limitations. The other thing here is that the resource part is very general, and, per the earlier question, you can have resources that are dependent: one transaction can be dependent on another. However, we encode this all in an additive, linear way, and that's probably not the most efficient thing to do, because, for the reasons I talked about earlier, you get this exploding complexity as you do that. If you don't do that, though, you get into the non-convex world. It might be a more succinct, lower-complexity way of describing what you want to do, but you won't actually be able to solve it. And this entire framework does rely on strong duality, which you mostly only get in convex optimization problems; it's very rare to get it in non-convex problems.
And that allows us to look at the prices, which are the dual: instead of looking at which transactions I include, I look at how I set prices. But yeah, that's a great question. Thank you. Thank you, Theo. Thank you. And now, people, let's go to the top of the mountain to find out where it's going to be, Devcon 7. See you there.