Awesome. Welcome everyone. Let's get started. My name is Marek Olszewski. Today we're going to be talking about shrubs, which is a new Zcash-style, gas-efficient privacy protocol that you can implement as a smart contract. This is joint work between cLabs and Matter Labs with the following folks — Alexander Vlasov and Alex Gluchowski — and thanks to Ronald Traumer for being a good partner in the midst of it. Cool, so let's begin.

First of all, why privacy? Why shrubs at all? Maybe just some quick context. Celo is essentially a mobile-first fork of Ethereum, which allows us to build a user experience like this. It has a lot of focus on light clients: we use SNARK-based proofs to prove that a header is part of the chain. We have a stablecoin that's backed in part by ETH, which is why it's relevant to the Ethereum community. And we have a way to send payments to phone numbers. All of that means that if you want to send a payment to a friend — say you want to pay someone back for coffee — you can find them in your contact list, and since Celo lets you send payments to phone numbers in a decentralized way, you can quickly send, say, five Celo Dollars to them. And since it's 2019, of course, you can also include emoji. Critically, you can pay for the transaction using the stable token itself. The whole experience is quite quick and fully decentralized — this is running on a light client, sending using that SNARK-based light-client protocol. It allows you to recreate a Venmo- or PayPal-like experience that is fully permissionless and fully peer-to-peer.

So that's Celo. And given that it has this really big focus on sending things that are useful as a medium of exchange — sending stable value — we hypothesize that privacy is actually even more important for such a network, a Venmo for stablecoins, than for coins that are primarily store-of-value coins. So we spent a fair amount of time thinking about how we can add privacy to Celo.
And again, Celo is a fork of Ethereum and it has the EVM, so everything that we're going to talk about today is completely applicable to Ethereum. We had three goals. We wanted to make the balances of these transactions fully private. Since we didn't want linkability, we wanted to keep the recipient and sender addresses fully private as well — and this, I think, isn't that common today in other Ethereum-based privacy solutions. And finally, we wanted all of this to work really well on mobile; Celo, again, is mobile first.

To reach these goals, we decided to look at Zcash and start porting Zcash into a Solidity smart contract. We quickly ran into a few issues. Number one, the gas cost is, perhaps not surprisingly, quite high. And number two, the mobile support is actually not that great. For this talk, I'm going to mostly focus on the former and just hint a little bit at possible solutions for the latter.

So maybe some super quick background on Zcash, just to get everyone on the same page. In Zcash there are a number of operations, one of which is the shielding operation. That's when you take one public ZEC and turn it into one private ZEC. The way you do that is you create a serial number and sample some randomness r. Then you compute a commitment to that serial number, cm, and create a public transaction that shares that commitment while spending one ZEC. The miners will add that transaction, take that ZEC, put it into an escrow pool that backs the private coins, and add the commitment to what's known as the commitment Merkle tree. Once that commitment is in the Merkle tree, it exists in the private world of Zcash.
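To make the bookkeeping concrete, here is a minimal Python sketch of that shielding step. This follows the simplified construction just described, not the real Zcash implementation: a SHA-256 hash stands in for the actual commitment scheme, a plain list stands in for the on-chain Merkle tree, and all names (`commit`, `CommitmentPool`, `shield`) are illustrative.

```python
import hashlib
import os

def commit(serial: bytes, r: bytes) -> bytes:
    """Commitment cm to a serial number under randomness r
    (SHA-256 here is only a stand-in for the real scheme)."""
    return hashlib.sha256(serial + r).digest()

class CommitmentPool:
    """Escrow pool: holds the shielded coins and the commitment leaves."""
    def __init__(self):
        self.escrow_balance = 0
        self.leaves = []          # leaves of the commitment Merkle tree

    def shield(self, cm: bytes, amount: int) -> int:
        """Public tx: deposit `amount` into escrow and record cm."""
        self.escrow_balance += amount
        self.leaves.append(cm)
        return len(self.leaves) - 1   # index of the new leaf

# Usage: turn one public ZEC into one private coin
serial = os.urandom(32)
r = os.urandom(32)
cm = commit(serial, r)
pool = CommitmentPool()
idx = pool.shield(cm, 1)
```

The serial number and randomness stay on the user's device; only `cm` and the deposited amount are visible on chain.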
So next up, how do you then spend this private coin? I think one of the easier things to understand is how you turn it back into a public coin — that's called de-shielding. And I should add that for both this and the last example we're talking about the simplified construction that exists in the Zerocash paper. It's actually a little more complicated than this, but for purposes of making it easy to understand, you can focus on the simplified construction. As you turn this back into a public, t-address ZEC, you have to reveal the serial number of your commitment. So let's assume we're using this commitment over here. Then, in zero knowledge, you have to prove that you know some randomness r such that this commitment, cm, appears in the commitment tree. The way you do that is you construct a Merkle path that goes up to the root, and inside a SNARK circuit you effectively prove that this Merkle path exists and is valid. Other full nodes and miners then take this proof and verify it, giving the proof the root of the same Merkle tree which they themselves maintain. If the proof verifies, they add the serial number that you revealed earlier to what's called the nullifier set, so that you can't spend that commitment again, and then one ZEC is given back to you from the escrow contract that backs all the private coins.

And if you want to transact privately, you're in effect doing the de-shielding operation and the shielding operation back to back: you first prove that you have one of these commitments, and then you create a new commitment, usually tied to the recipient's address. Hopefully that more or less made sense.

So, why is this difficult to do in a smart contract? The answer is: it's not difficult, but it is expensive.
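The de-shielding checks can be sketched as follows. This is only meant to show the bookkeeping: the SNARK is replaced by a direct Merkle-path verification, which of course gives up the zero-knowledge property (the real protocol performs this check inside the proof so the commitment and path stay hidden). The class and function names are illustrative.

```python
import hashlib

def h(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(left + right).digest()

def verify_merkle_path(leaf, path, root):
    """path: list of (sibling, sibling_is_left) pairs from leaf to root."""
    node = leaf
    for sibling, sibling_is_left in path:
        node = h(sibling, node) if sibling_is_left else h(node, sibling)
    return node == root

class ShieldedPool:
    """Tracks the commitment-tree root and the nullifier set."""
    def __init__(self, root):
        self.root = root
        self.nullifiers = set()   # serial numbers of spent coins

    def deshield(self, serial, cm, path):
        if serial in self.nullifiers:
            return False          # double-spend attempt
        if not verify_merkle_path(cm, path, self.root):
            return False          # commitment not in the tree
        self.nullifiers.add(serial)
        return True               # contract releases 1 ZEC from escrow

# Usage: a two-leaf tree holding commitments cm_a and cm_b
cm_a, cm_b = b"\x01" * 32, b"\x02" * 32
pool = ShieldedPool(root=h(cm_a, cm_b))
ok = pool.deshield(b"serial-a", cm_a, [(cm_b, False)])
```

The nullifier set is the only state that grows on spends; the commitment tree is append-only.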
So if you were to use a tree that's around 32 levels high — and this is what Zcash is today; Zerocash recommends a bigger tree, but Zcash has been increasing the tree size with every upgrade — then you would require around 1.7 million gas just to make an insert operation into this tree, not even counting the gas needed to verify the actual proof. And this is assuming you're using the MiMC hash function, which we had to use in order to have efficient proving times on mobile devices. If you used SHA-256, the insert would be cheaper, but it would be very, very expensive to actually construct the proofs and construct the transactions on your phone. So we implemented MiMC in the smart contract, and if you include that, it takes about 1.7 million gas for one of these inserts — and this is even with optimizations to the tree structure; we spent a bit of time trying to make it quite efficient.

So again, why is it expensive? Well, when you insert one of these leaf nodes, you have to compute the hashes going all the way up to the root so that you can update the latest Merkle root commitment. That requires h hashes and, depending on how you're storing the tree and how you're optimizing it, potentially up to h storage updates, which is pretty expensive. And as you make the tree bigger so it can support more total transactions, this becomes even more expensive.

So how do we improve on this? We came up with something we're calling shrubs. The basic idea is a novel variant of a Merkle tree, which we're calling a Merkle shrub. In this data structure, the tree is not defined by its root; instead it's defined by the path to the rightmost non-empty node. That path is basically the frontier of this tree, which we're filling from left to right.
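To make the cost of the traditional approach concrete, here's a minimal sketch of a standard incremental (append-only) Merkle tree with the usual cache of filled left subtrees. The names are illustrative, and SHA-256 stands in for MiMC, but the shape of the cost is the point: every insert performs h hashes and up to h cached-node updates, no matter where in the tree it lands.

```python
import hashlib

class IncrementalMerkleTree:
    """Append-only Merkle tree; every insert rehashes up to the root."""
    def __init__(self, height):
        self.height = height
        self.next_index = 0
        self.hash_count = 0
        # zeros[l] = root of an all-empty subtree of height l
        self.zeros = [b"\x00" * 32]
        for _ in range(height):
            self.zeros.append(self._h(self.zeros[-1], self.zeros[-1]))
        self.filled = list(self.zeros[:height])   # cached left siblings
        self.root = self.zeros[height]
        self.hash_count = 0                       # don't count setup hashes

    def _h(self, left, right):
        self.hash_count += 1
        return hashlib.sha256(left + right).digest()

    def insert(self, leaf):
        idx, cur = self.next_index, leaf
        for level in range(self.height):
            if idx % 2 == 0:
                self.filled[level] = cur              # remember left child
                cur = self._h(cur, self.zeros[level])
            else:
                cur = self._h(self.filled[level], cur)
            idx //= 2
        self.root = cur
        self.next_index += 1
```

Counting the calls shows the h-hashes-per-insert behavior directly: a height-32 tree does 32 hashes (and, on chain, a comparable number of storage writes) for every single insert.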
And as an optimization, instead of storing that entire path, since some of the nodes on the path may not have children on the right side, we can actually store fewer than all h nodes: we only store the nodes needed to calculate the hashes of the nodes along this path. We call the nodes required to reconstruct this path shrubs — and we call them that because, if you include the path, it looks a little bit like a shrub, not quite a tree.

So what does a proof look like using this new data structure? You basically have to construct a Merkle proof that doesn't end at the root, but instead ends at this frontier path. Say we want to prove that this node is part of the set. Then you construct a Merkle path that doesn't go all the way up, but goes to this node, which is part of the frontier. To do that, you need this node and this node, and then you can compute the intermediate node and verify that it matches what should be there. And if you want to verify this proof, you need to know the frontier: instead of knowing just the root, the verifier needs to know this whole frontier — or again, as an optimization, the shrubs that you can use to generate that frontier.

Let's look at another example. Say you want to verify this one: if you go all the way up to the top, you'll intersect the frontier at the root. And that's fine — the root is part of the frontier as well, so the algorithm works just as before; you just need a few more nodes as part of your Merkle proof.

So in effect, what we're doing is trading off the amount of work you have to do to perform inserts against the work you have to do to verify these Merkle proofs. To verify a Merkle proof, we need more public inputs, which makes the verification a little more expensive. But as we'll find out, the trade-off is actually well worth it. So it all comes down to the cost of inserts.
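A membership check against the frontier can be sketched like this. It's the same climb as an ordinary Merkle verification, except it accepts as soon as the path reaches any node the verifier already knows on the frontier, so the path can be shorter than the tree height. This is our illustrative reading of the scheme (`verify_to_frontier` and the tiny example tree are made up for the sketch), not the actual circuit.

```python
import hashlib

def h(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(left + right).digest()

def verify_to_frontier(leaf, path, frontier):
    """Climb from leaf, stopping as soon as we hit a frontier node.
    path: list of (sibling, sibling_is_left) pairs, possibly shorter
    than the full tree height."""
    node = leaf
    if node in frontier:
        return True
    for sibling, sibling_is_left in path:
        node = h(sibling, node) if sibling_is_left else h(node, sibling)
        if node in frontier:
            return True
    return False

# Usage: a height-2 tree filled left to right with leaves a, b, c
EMPTY = b"\x00" * 32
a, b, c = b"\x01" * 32, b"\x02" * 32, b"\x03" * 32
ab, cE = h(a, b), h(c, EMPTY)
root = h(ab, cE)
# Frontier: the path from the root down to the rightmost non-empty leaf c
frontier = {root, cE, c}
```

Proving c needs no path at all (it sits on the frontier), while proving a climbs all the way up and intersects the frontier at the root — exactly the two cases described above.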
So what does an insert now look like? Again, that was the expensive operation in the traditional Merkle tree. Say we want to insert this node over here. We insert it into the next slot, since we're incrementally building this tree from left to right, and then we have to compute all of the hashes and update all of the shrubs needed to construct the new frontier. If you look, these are the shrubs needed to construct the frontier before we added this node, and these are the ones needed to calculate the frontier after we added it. Since the only new shrub is the actual leaf node that we just added, you only need one storage update and zero hashes to perform this insert. So that's pretty efficient, and it translates, obviously, into very good gas savings.

Now, we're not always going to be so lucky. It turns out that the actual amount of work you have to do in this data structure depends on where you're inserting into the tree. So let's imagine we're inserting into the next leaf. These are the shrubs now needed to construct the frontier — which I'm actually not showing; the frontier would go to the right and then off to the left. Since we have to calculate the value of this node, we need its left child and then, ultimately, this leaf, since that's the only node contributing to it. Since we already had this node as a previous shrub, all we have to do is insert this node, which we're doing as part of the insertion, and update this one — and to update it, we need to compute the hash of this and the hash of this. So that means you're doing two updates and two hashes to insert this leaf. And in the expected case, you end up having to do about 1.5 updates and one hash on average. Obviously, in the worst case you may have to go all the way to the top of the tree, but, interestingly, in the expected case this is actually quite cheap.

So, what's the performance?
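One way to see why the expected cost is so low is to model the shrub state as the roots of the maximal complete subtrees, merged with the carry pattern of a binary counter — this is our sketch of the idea under that reading, not the actual contract code. Most inserts just place a leaf; a merge (one hash) only happens when two equal-height subtrees meet, so the amortized cost is about one hash per insert, versus h hashes for the classic tree, with the rare worst case still climbing to the top.

```python
import hashlib

def h(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(left + right).digest()

class Shrubs:
    """Keeps only the roots of maximal complete subtrees ('shrubs'),
    merging neighbours of equal height like carries in binary addition."""
    def __init__(self):
        self.peaks = []        # (height, digest) pairs, leftmost first
        self.hash_count = 0

    def insert(self, leaf):
        self.peaks.append((0, leaf))
        # Carry: merge while the two rightmost subtrees have equal height
        while len(self.peaks) >= 2 and self.peaks[-1][0] == self.peaks[-2][0]:
            ht, right = self.peaks.pop()
            _, left = self.peaks.pop()
            self.peaks.append((ht + 1, h(left, right)))
            self.hash_count += 1

# Usage: fill a complete subtree of 1024 leaves
s = Shrubs()
for i in range(1024):
    s.insert(i.to_bytes(32, "big"))
```

After 1024 inserts the whole structure has collapsed to a single height-10 root, and the total number of hashes is 1023 — under one hash per insert on average.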
So, we actually implemented this in a Solidity contract, and you can access it at github.com/celo-org/shrubs.

That's a good question — so the question is whether the verifier needs to know which subtree you're in, the right subtree or the left subtree. The verifier is only given, as public inputs to the proof, the actual shrubs needed to construct the frontier. Then, in zero knowledge, you prove that your node is connected to that frontier. So the verifier isn't able to infer where in the tree your commitment is.

Coming back to performance: we implemented a few variants of different Merkle trees and benchmarked them for a tree height of 32 levels. The basic implementation, not surprisingly, is slow, requiring 1.8 million gas for every insert — it doesn't matter where in the tree you're inserting, you're always hashing up to the root, so it's always expensive. In the optimized case we're able to shave off a little bit of the gas; it ends up being around 1.7 million. With shrubs, in the worst case it's still 1.7 million, but in the best case it can be as low as 45,000 gas, which is quite incredible. And in the mean case, on average, it's just under 100,000 — still pretty good.

So, the next question is: if you were to use this data structure to implement Zcash entirely in Solidity, what would the actual cost of a transaction be? We implemented the verification piece of that with Groth16, and we ended up with these numbers. On average, the commitment insert is around 100,000 gas. Then, depending on how many inputs you have to your Pour operation — you can have one or two — you're going to have to add one or two entries to the nullifier set, so this would take around 26,000 gas, or twice that, to prevent double-spends.
And then next you have to pre-process the public inputs. In this case the public inputs are now these shrubs, so this actually costs more than before — before, you only had to pass the Merkle root as the public input. But with EIP-1108 this is going to be a lot cheaper after the next hard fork, so it's still pretty reasonable, around 300,000 gas. Then you have to do the actual verification, which involves around four pairing operations, I believe, which adds roughly another 200,000. So the total cost is around 500,000 gas, depending again on how many inputs you have — which is about 2x more expensive than Aztec, but you get unlinkability: all of your recipient and sender addresses stay private.

Now, we also looked at what this would look like with PLONK. We didn't implement it like we did with Groth16, but we did some estimates. The reason PLONK is interesting is that the processing of the public inputs is a lot cheaper — you don't have to do modular exponentiations, so that part is quite a bit cheaper. Unfortunately, the verification itself ends up being a fair amount more expensive, so the overall gas price is higher. But this is still potentially interesting, because as you increase the height of the tree, you're going to add less gas on the public-input side with PLONK than with Groth16, where that cost unfortunately grows with the height of the tree — so with Groth16 you might end up paying quite a bit more as you increase the height. Generally, we're hopeful that with some optimizations we can lower this and potentially get to a point where PLONK is even cheaper than using Groth16 with an even higher tree, and all of these things will be cheaper.

Cool, so I mentioned mobile — what about it? The nice thing is that by using MiMC, the proving times are quite short, so you can construct these transactions in just a few seconds on a 200-dollar phone. But unfortunately, Zcash is not that light-client friendly. It's hard to know which transactions are for you: since they are encrypted, you would have to try to decrypt every
transaction in every block to find which transactions are actually coming to you, and that's obviously hard to do as a light client. It's also hard to know what your hidden inputs should be to construct those Merkle paths going up to the frontier. A full node can figure this out pretty easily — they see every transaction, so they can maintain the pieces of the tree they care about to construct those paths — but as a light client, you wouldn't want to query a full node for those nodes, because that would reveal which commitments are your own. So this is still an unsolved problem. One solution is to use operators and introduce the concept of a decryption key: the decryption key would allow third parties to detect all transactions that are for you, without giving them the ability to actually view the transactions and see the balances and senders. We're looking into doing that, so that with just a limited amount of trust, a full node can help you do these things.

And so that's it. I just wanted to thank you for your time, and, yeah, I'll answer any questions.

So the question is about the size of the proof. It's not the size of the proof — it's the amount of gas you have to pay that will differ depending on where you are in that Merkle tree. One way of making this a little more uniform is to use a gas token, so that everyone contributes more or less the same amount of gas, and if you happen to be unlucky and have to hash all the way to the root, you can lean on that gas token more than other users would. But is the question about the size of the proof? So it's not about where you are in the tree; it's more about the size of the tree — the number of transactions that the tree can support. In the contract itself you always pass the fixed edge — yeah, whose size is the height of the tree. Yeah, we can chat more on that. Next question: do you see any differences between
this and [inaudible]? I'm not sure, but I'm not the right person to answer that, so let's chat more. It's slightly different — it was inspired by it, but it's a slightly different concept. A Merkle tree is a cryptographic accumulator which gives you a constant-size commitment to a vector of values, and this commitment is the root of the tree. With shrubs, it's also a cryptographic accumulator, but with a commitment size proportional to the height of the tree. The commitment is determined by the edge of the tree — the entire edge, which contains the root of every complete subtree. This edge is your commitment, and you have to pass this commitment — all the items on the edge — to the verifier. But since we have a tree of fixed size — we just pick a size like 2 to the power of 32 — the size of the commitment is also fixed: it's the logarithm of that, which is 32, and we can pass those as public inputs, or we can also optimize and just compress them into a hash.

Thanks, everyone — I think that's time, but we'll be outside if you have any more questions.