Thank you all for coming, and thanks in particular to the volunteers who made this happen. I've worked with some of them and they do a lot of work for this, so thanks everyone. So I'm Jason Mancuso. I'm a research scientist at Dropout Labs. We're a startup focused on privacy-preserving machine learning, building tools and doing research to solve that problem, which I'll be talking about today. I'll mainly be talking about why we should care, what people are doing about it, and what we could be doing better. This is going to be a bit more of a technical talk in the sense that it'll introduce you to a lot of topics you might not be familiar with, and it does, to some extent, assume a basic familiarity with machine learning, but I'll try to explain as much as possible as I go.

So first, some motivation for why we should care. This is a paper that recently came out from Nicolas Papernot at Google Brain. He argues that when thinking about security and privacy in machine learning, we should take inspiration from traditional computer security and not reinvent the wheel just because we're doing AI. He builds on principles that were published in the 70s in "The Protection of Information in Computer Systems" by Saltzer and Schroeder. Roughly half of the security principles they lay out dictate that, in order to make computer systems secure, we need to prevent excessive information flow within those systems. So without even accounting for adversarial machine learning and all the attacks we've probably heard about throughout the conference, I can confidently claim that traditional machine learning is not secure in this sense, because there is excessive information flow in the form of privacy leakage. So really, privacy is a security problem.

But not only that: privacy leakage creates bottlenecks in the existing machine learning process. Here's what I mean by that. This is the normal machine learning workflow. On the left, you have some data source, some data generator, whether it's a public dataset that's been compiled or a private dataset owned by, for example, a hospital or some corporation. Traditionally, when we train machine learning models, we have to aggregate this data somehow in a central place. We engage in training the model, we apply some learning algorithm to it — hand-waving away a lot of data science complexity — and once we're done with that, we produce the model, deploy it, and put it into production. Whether we're serving users who are a separate party or serving some internal party within the same organization, there's always going to be some deployment phase where you're running predictions. This doesn't fit every use case for machine learning, but it describes quite a few of them. One thing to notice is that there are multiple parties in this setup, but there is a single point of attack: as an attacker, I only really need to compromise the middle party, the aggregator, to gain access to every asset in this workflow.

This is just to say that we do machine learning on private and sensitive data where there are real concerns, and machine learning does quite well on it. This is an example of skin cancer classification, but as the user on the right, taking a picture of my mole and uploading it to a cloud somewhere, I might not be incentivized to do this.
Because if I do this, then I'm revealing whether I have cancer or not to a third party, and I don't know how they're going to use that information. So this leads to the concept of bottlenecks coming from privacy leakage in the process, and there are multiple different points where these bottlenecks occur.

First is actually getting the data for training the machine learning model. If you're part of a security-aware organization, a lot of times that can be challenging; going through all the hoops to get access to data is definitely a bottleneck. And as a result, we can't actually partner with external organizations to pool data that should describe the same phenomenon to train better models. In addition, the central party who does the actual training takes on a lot of risk by aggregating the data in this way, training the machine learning model, and then deploying it. Aggregating the data, if you're not careful, can lead to data breaches. And once you deploy the system, you can't be sure how that system is going to be used and who's going to query it, in which case there could be attacks that people use to extract information from that service. This could be in the form of model duplication or model theft, where you've spent all this money creating a machine learning model and someone can come along and, just by querying it, distill the knowledge that's been encoded in that model. But you can also perform attacks on machine learning models to extract the training data. For example, model inversion does this: once you have a machine learning model, you can optimize against it to reconstruct a data point that was used in the training set. I'll get into this a bit more as well. And finally, the incentive problem that I mentioned before is a bottleneck as well: if users aren't incentivized to use the service, it's not a good machine learning system, right?

So, a few possibilities for how we can solve this. I'll just cover these lightly and then dive into them a little more deeply. First is this concept of sanitization, where we scrub the data at each point whenever it changes hands by adding noise, so that any single data point in the dataset is obfuscated or masked. If I want to know something about a specific person, because of the added noise, I can never be sure whether what I'm seeing is their actual data or some obfuscation of it. Another possibility is that we can move some of these components to different parties. So instead of bringing the data to the model, maybe I can bring the model to the data and train the model locally, on device. Another possibility is encryption. This would be really great, right? If we could just encrypt data, process it while encrypted, and still learn from it, we'd resolve a lot of these privacy problems.

So now I'm going to go into how these high-level ideas are actually implemented in research and in production, and I'm going to do so based on three privacy-preserving primitives: differential privacy, federated learning, and secure computation. First, differential privacy, which is related to the notion of sanitization I talked about, where by adding noise you can obfuscate the actual data. This is the definition of differential privacy. It is somewhat technical, so I'll try to explain it intuitively.
So assume we have some dataset and we're allowed to interactively run queries on it, as long as those queries aren't "give me this specific person's data", right? So what I'm going to do is query the full dataset, maybe run some statistic over it, and get a result that includes this one person's data — Alice's data. Then I can query that same database without Alice's data. And what happens when I subtract these two results is that I end up with just Alice's data. So simply preventing someone from accessing a single data point isn't enough to actually guarantee the privacy of that data point.

With differential privacy, what we do is obfuscate each query with random noise, and we construct that noise such that the expected difference between these two queries is always as small as possible — statistically, zero. In practice zero isn't actually achievable, so we use this epsilon number that's slightly greater than zero. This is really cool because it allows us to actually interact with the dataset, and any time we take subsets of it, we can be sure that the result is only negligibly different, statistically, from other subsets, which means we can protect our users' privacy in this database. Oh, and that randomized mechanism can also include arbitrary pre- and post-processing, which means it could be the forward pass of a machine learning model, or it could be the full learning algorithm that produces the weights. The randomized mechanism is essentially just a black box that has some randomness in it, and as long as we can prove that it preserves privacy in this way, then we're good.

Now, the problem here is that there is a trade-off between privacy and utility. Here's a very simple example: I have some set, and every time I query it, I add a little bit of noise from a normal distribution, and this is the expected privacy loss of doing that with, for example, averaging. What happens as I turn this last number into an extreme outlier — as I take it to infinity — is that the privacy cost, even after we add this little bit of noise, also goes to infinity. That means in order to bound the privacy loss, we need to take the noise to infinity as well. But if we take the noise to infinity, then we're adding arbitrarily more noise, and any statistic we take from this set becomes meaningless because it's just completely noisy now. This is a problem in general for databases, but for machine learning it's actually quite nice, because in machine learning we care about general patterns. We want to pull general patterns from large datasets, and we don't care so much about outliers and unique points — we usually remove those if possible, at least for many applications.

So what's nice about this approach is that we have a mathematically rigorous way of proving that our system is private in this way. And this notion of privacy is actually very strong: it protects against a broad range of attacks, like membership inference, re-identification attacks, set differencing, model inversion — the types of things that are in the literature.
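To make the noisy-mean example from a moment ago concrete, here's a toy Laplace-mechanism version in Python — my own illustrative numbers and clipping bounds, not any particular paper's mechanism:

```python
import numpy as np

def noisy_mean(values, lower, upper, epsilon):
    """Laplace mechanism for a differentially private mean.

    Each value is clipped to [lower, upper], so one person can shift the mean
    by at most (upper - lower) / n -- that's the sensitivity we calibrate the
    noise to. Smaller epsilon (more privacy) means more noise.
    """
    values = np.clip(values, lower, upper)
    sensitivity = (upper - lower) / len(values)
    noise = np.random.laplace(scale=sensitivity / epsilon)
    return values.mean() + noise

salaries = np.array([48_000, 52_000, 50_000, 49_000, 51_000])
print(noisy_mean(salaries, 0, 100_000, epsilon=0.5))

# An extreme outlier forces a choice: either widen the clipping bounds
# (so everyone gets more noise) or clip it away (so the statistic ignores it).
salaries_with_outlier = np.append(salaries, 10_000_000)
print(noisy_mean(salaries_with_outlier, 0, 100_000, epsilon=0.5))
```

The clipping bound is what ties the noise scale to the worst-case influence of one person, so the outlier either blows up the bound and the noise, or gets clipped away — which is exactly the privacy-utility tension I just described.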
Differential privacy also gives us this intuitive notion of a privacy budget — this epsilon number we talked about, how negligibly different statistics or predictions are. It becomes a budget because if you run the same query a number of times, the randomness, because it's zero-centered noise, will eventually average out and home in on the true value. So we can say: as a user, you can only interact with this dataset or this model so many times before you've exhausted your privacy budget and can no longer interact with the system.

There's a lot of machine learning research going into this. Just from Google's group, there's DP-SGD, where the idea is that instead of adding the noise to the data, you add it to the gradients produced during stochastic gradient descent while training your model, and then you can prove that this produces weights that are differentially private and don't leak anything about the underlying dataset. PATE is another one. It uses model ensembles and distills knowledge from those ensembles in a way that makes the final model differentially private. There's also a lot of research going into new definitions of differential privacy, new proof techniques, and new randomized mechanisms to improve the privacy-utility trade-off I talked about.

But there are some open questions. Differential privacy is a very strong guarantee, but it's not everything. For example, if other users are contributing data to train a model, I don't actually need to contribute my data in order to experience privacy loss: people can just run that model on me once it's trained, and then I've experienced privacy loss that spreads, in a sense, through those other people. We really don't have any solutions to this yet, but it is a very big open question.

So differential privacy does solve some of these bottlenecks. For data access, you can scrub data at any point — you can scrub any component of the system with noise and then prove with differential privacy that it preserves privacy. But the problem is that if you do this too much, if you add too much noise, the utility of the whole system falls apart. Maybe the user is no longer incentivized to use your service, and it's pointless, right?

So that moves us towards different primitives. The one I'll talk about now is federated learning, where we just take the model and train it on device. Google is really well known for this. They do it across thousands or millions of devices, and that's a very specific use case: when you do that, you have low availability of data owners, parties with low resources, and low bandwidth. So that's a tricky setting for on-device learning, but in general you can imagine doing this between five or ten hospitals, in which case they can spin up servers with GPUs, have strong network connections between them, and always be available. So this is actually a viable avenue for training machine learning models in a non-standard way. What's nice about it is that we don't centralize the raw data, which removes some of the excessive information flow I talked about — in particular the data access problem, because you never have to centralize the data. And perhaps users are incentivized to use your service if they know that you're only running predictions on device and they're never actually sending their data to you.
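Just to make that concrete, here's roughly what this looks like — a toy federated averaging loop in plain numpy, in the spirit of FedAvg rather than anyone's production system; the "hospitals" and the linear model are made up purely for illustration:

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's on-device training: a few steps of gradient descent for a
    linear model, on data that never leaves the device."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def federated_averaging(global_w, clients, rounds=10):
    """Each round: ship the current model to every client, let them train
    locally, then average the returned weights (weighted by data size).
    Only model parameters ever travel over the wire, never raw data."""
    for _ in range(rounds):
        local_ws = [local_update(global_w, X, y) for X, y in clients]
        sizes = [len(y) for _, y in clients]
        global_w = np.average(local_ws, axis=0, weights=sizes)
    return global_w

# Three "hospitals" with private data drawn from the same underlying pattern.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for n in (50, 80, 120):
    X = rng.normal(size=(n, 2))
    clients.append((X, X @ true_w + 0.1 * rng.normal(size=n)))

print(federated_averaging(np.zeros(2), clients))  # approaches [2, -1]
```

The key property is that only model weights move between parties — the raw data stays where it was generated.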
But the problem is that our risk management is not actually that strong here, and there's no guarantee that I, as the designer of the service, am not going to peek at your data when I want or need to. So we have no privacy guarantee like we had with differential privacy. The other problem is that the model is exposed. When we send the model out for training to these arbitrary data owners, we now have to treat them as adversaries, because they could try to influence that process in a way that's advantageous to them. Or, if you're training a machine learning model amongst a bunch of parties, at some point one of those parties is going to end up with a model that's essentially the same as the final model, which means this asset you've trained is now up for grabs. So this is not an ideal approach, but it does offer some additional privacy that we don't have in the current paradigm.

Okay, so what if we could try to accomplish the same thing we're trying to do with federated learning, but add some kind of privacy guarantee back in, without adding so much noise that the service doesn't work anymore? That's the notion behind secure computation. This is research from the 80s. The general idea is: how can we compute a publicly known function while the inputs and outputs of that function are kept private? And private here is usually defined similarly to how it is in standard encryption — in communication encryption, for example. So again, just to reiterate, we have this publicly known function and we want to evaluate it without necessarily knowing the inputs and outputs. That function could be the forward pass of a machine learning model, or it could be the entire learning algorithm, so we can do training or inference this way if we can pull it off.

How we actually do it is we find an encryption scheme that is homomorphic with respect to F. This just means that you can encrypt the inputs, run the function on the encrypted values, and end up with an encrypted output that you can then decrypt. That's the bottom line: you have an encryptor and a decryptor. The encryptor is homomorphic with respect to the function, and the decryptor is cryptographically hard to reproduce unless you hold the right secret material. For the rest of this talk, I'll refer to this as the homomorphic paradigm for machine learning, because it's the dominant one. There are actually a lot of different ways to do secure computation. There are a lot of techniques, and they all have different trade-offs. The main takeaway is that they're not directly comparable, because they all have slightly different threat models. There is no silver bullet, and most state-of-the-art protocols actually use some combination of these approaches.

We'll focus for a bit on secret sharing, just to demonstrate how this applies to machine learning. For example, if you wanted to run inference on a simple linear model without exposing the data, you could encrypt the inputs — here we do it with secret sharing. What we're going to do is take a little bit of randomness and subtract it from the data point. That randomness acts as a one-time pad, which is extremely secure. Now we have three components: the original data point, and two masked versions of it — the shares. So if I want to compute this forward pass, this linear regression, with someone, I send them one of those shares.
We essentially do a public multiplication with the weights and then a private addition across the components of the vector — this is just a simple dot product — and what we end up with is an encrypted version of the final result. If we then send the shares of that output back to each other and reconstruct, we reveal the answer of the machine learning model's inference. We can actually do this while securing the weights too: we can just treat the weights as another input to this dot product function and do it the same way. It gets a little more complicated and a little slower, but it is possible. And this matrix math is actually really conducive to secure computation, which is why this is a plausible approach to scale.

Once you move past simple dot products, you can scale this up to larger models and do a lot of interesting things. Beyond prediction, you can do training. This is an example of server-aided training, where all the data sources on the left, instead of aggregating their plaintext data, encrypt the data and send it to these two servers. The two servers then engage in a computation with each other — they do a lot of communication back and forth to be able to do this — but in the end you get a shared machine learning model that is kept private, where the final party who wants to deploy the model never had access to the actual raw data, only to an encrypted version of it.

So what's nice here is that we're able to run machine learning on encrypted data. What's bad about this, in particular with secret sharing, is that there's the possibility these two servers collude with each other to reveal the data and break the privacy guarantee. There are secure computation techniques without that assumption; that's just a problem for this particular one. And instead of centralizing the encrypted data on these two servers, you could think of just doing the secure computation between the parties directly, and never centralizing any data. You can do that just fine — it just gets slower.

This research field has a long history. On the left here, we have mainly research papers that apply secure computation to machine learning, but secure computation has a long history from before machine learning, going back through the AI winters. On the right, we have some general secure computation frameworks that people have implemented machine learning algorithms on top of. And on the bottom, we have two specialized projects that, instead of using these generic academic C++ libraries, are trying to build privacy-preserving machine learning primitives into the frameworks people are familiar with, in particular TensorFlow and PyTorch. I'll talk a little more about that later.

So, some open questions around secure computation. As I've mentioned a few times, it is extremely slow. Right now we can't actually scale this to, for example, a 152-layer neural network. We can do pretty well, but that's currently out of reach. So the question becomes: can we ever account for that? Can we make that extra performance cost marginal rather than central?
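By the way, the secret-shared dot product I walked through earlier looks roughly like this in code — a toy sketch with integer arithmetic modulo a public prime, and with the weights left in plaintext for simplicity (sharing the weights too works the same way, but needs an interactive multiplication protocol):

```python
import numpy as np

P = 2**31 - 1                        # arithmetic is done modulo a public prime
rng = np.random.default_rng(42)

def share(x):
    """Split an integer vector x into two additive shares that sum to x mod P."""
    r = rng.integers(0, P, size=x.shape)
    return r, (x - r) % P

def reconstruct(a, b):
    return (a + b) % P

# Private input vector and public (plaintext) weights of a linear model.
x = np.array([3, 1, 4, 1, 5])
w = np.array([2, 0, 7, 1, 8])

x0, x1 = share(x)                    # server 0 holds x0, server 1 holds x1
y0 = (w * x0) % P                    # each server multiplies its share by the
y1 = (w * x1) % P                    # public weights and sums locally
y = reconstruct(y0.sum() % P, y1.sum() % P)

print(y, np.dot(w, x))               # both print 75
```

Neither server can learn anything about x from its share alone — each share is uniformly random on its own — but together the shares reconstruct the true dot product.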
Even if we can't make that cost marginal, secure computation for machine learning can still enable applications that aren't possible today purely because of privacy concerns — regulatory, competitive, or sensitivity issues. But to get at this question of marginal cost: can we combine MPC with other, faster approaches, keeping some of its privacy guarantee without totally ruining the speed of machine learning? There's a good example that Google put out called secure aggregation, where they apply this to federated learning to give it some kind of privacy guarantee. And similar to what people are doing with differential privacy now, can we think of alternative definitions for privacy and security that let us run machine learning in a more performant way while still preserving some notion of privacy, as opposed to the strong information-theoretic privacy we get from secure computation? That's part of my own research as well — I'd like to answer this question.

Just to summarize secure computation real quickly: it gives us very strong guarantees around accessing data and preventing unfettered access to it, which helps with the data access problem, with the risk management problem, and also with the incentive problem, because you can serve predictions this way and have formal guarantees around the user's privacy. It does not, on its own, prevent the differencing attacks and model inversion attacks I talked about — but differential privacy does, so we can combine it with MPC, with secure computation, to get a full privacy-preserving workflow with strong information-theoretic guarantees for the different parts of the system.

Okay, so now that I've talked about the primitives and what people have been doing for the past 20 years, let's look a bit at current research, what's happening now, and how we should prioritize it. A big focus right now is on designing systems and tools that make this easier to use. This is a paper we produced. It's doing secure computation in TensorFlow, which is of course the dominant deep learning framework, swallowing up more and more of the ecosystem as Google builds it out. What we were able to do is use the distributed computation engine in TensorFlow to perform secure computation through secret sharing. And we found that because they've put so much work into that distributed orchestration, it's actually a very convenient way of expressing these secure computations, and it's quite efficient. This is on GitHub, it's open source, you can check it out. Google themselves have published a few libraries around this stuff. On the left, Google's federated learning team has produced TensorFlow Federated, which is all about doing federated learning in TensorFlow. On the right is TensorFlow Privacy, which is more about differential privacy and training differentially private models. So now these three primitives are each available inside of TensorFlow. If you're familiar with TensorFlow, you can play with these. They're not totally integrated yet, but we're working on it.
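Since secure aggregation came up a moment ago, here's the core trick in toy form — pairwise random masks that hide each client's individual update but cancel out in the sum. This is only the idea; the real protocol also handles dropped clients and derives the masks through cryptographic key agreement:

```python
import numpy as np

rng = np.random.default_rng(0)

def mask_updates(updates):
    """Add a random pairwise mask to client i and subtract it from client j.

    Each client's individual update becomes unreadable noise, but the masks
    cancel when all masked updates are summed. Toy version only: no dropout
    handling and no key agreement, unlike the real secure aggregation protocol.
    """
    n = len(updates)
    masked = [u.astype(np.float64) for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            r = rng.normal(size=updates[0].shape)
            masked[i] += r
            masked[j] -= r
    return masked

client_updates = [rng.normal(size=4) for _ in range(5)]
masked = mask_updates(client_updates)

# The server only ever sees masked updates, but their sum equals the true sum.
print(np.allclose(sum(masked), sum(client_updates)))  # True
```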
Another thing we did recently involves a different kind of secure computation called homomorphic encryption, which is much closer to public-private key cryptography, except the cryptosystem is designed so that you can still do computations on the encrypted values. Microsoft has a really well-known package called SEAL that's highly optimized for this, and we were able to bridge TensorFlow with Microsoft SEAL so that you can do homomorphic encryption as well, as opposed to just multi-party computation. We've also experimented with deploying TensorFlow models in secure enclaves and trusted execution environments, in particular Intel SGX, and we have a project out there that won a Google and Intel confidential cloud computing competition, so that's pretty cool to check out.

Then there's another paper doing a similar kind of thing from a community called OpenMined. It's currently probably the largest community of people interested in privacy-preserving machine learning. It's run by Andrew Trask, who is a research scientist at DeepMind and a student at the University of Oxford. This is essentially building these three primitives into a system that's integrated from the beginning, whereas the TensorFlow projects started off in different groups, so while they're pretty performant, they're not well integrated. This project is more about having an ecosystem that's highly integrated with itself. I was involved in the early days of this one too, and it was recently featured in a free course on Udacity, sponsored by Facebook. If you're interested in playing around with these primitives in a way that's really accessible, you can take that course for free, and it'll introduce you to these techniques using PyTorch, which is what the system in this paper is built on.

Beyond systems and tools, there are some larger questions that we still have to figure out as a community. The question of centralization is of course a big one: if we're going to centralize things, how much should we centralize, and how should we centralize it? Unfortunately, this question has become quite heavily skewed by a lot of the blockchain hype train. For example, SingularityNET — if you've seen Sophia the robot — is all about combining blockchain with AI in a decentralized way, and that can be distracting for sure. But one good thing that has come with the blockchain hype is an interest in economics and machine learning and their combination. If everyone had control over their personal data through privacy-preserving machine learning, it's possible that data would turn into a liquid asset, which has pretty strong implications for how we do machine learning now, and for the notion of a data economy or a data marketplace. This then makes you wonder how you might price a trained machine learning model, or how you might value the underlying training set that was used to produce that model. Corporations estimate the cost and value of machine learning models all the time, but no one's really pricing data in this context, so that's a nice open area. It was recently brought up in an ICML paper from 2019, from a group at Stanford, called Data Shapley, which is about estimating the value of data.
What they found, which is really interesting, is that outlier data points are worth less than data that's representative of the pattern you're trying to learn. That's somewhat unsurprising, but it's also really helpful, because as I mentioned before, differential privacy is really strong unless you have these outliers. So what turns out to be valuable data lends itself well to differential privacy and the privacy-utility trade-off I mentioned earlier. This is super cool — I love this paper, even though it's unfamiliar territory for me as a machine learning person, since it focuses on estimating economic value — and I'll come back to it in a bit.

Moving on to secure computation: differential privacy and secure computation are probably the two big, strong-guarantee primitives that we would really like to see all machine learning done with, if it could be, because they're very secure and very strong. The main problem is just that we can't scale them to all the applications we see today. We do have a slight advantage in machine learning, though. The timeline is essentially this: you have these general secure computation techniques and frameworks, and for a long time people were optimizing them for generic computation. Then when people started applying them to machine learning, they realized that a lot of those optimizations are actually really convenient for matrix math and the operations machine learning uses. Then people started asking how to mix and match different protocols to improve performance for machine learning models. And now we're getting to the point where we ask: how can we tweak the machine learning models themselves to make them more efficient in this encrypted space? That's a lot of the work we've done at Dropout Labs.

For generic secure computation, it's unacceptable for a function to evaluate differently than it would have in plaintext. But for machine learning, we don't really care about specific examples, because we're already approximating some ideal function. If it's better to find some other approximation that runs more efficiently, we can do that, as long as the performance is roughly the same. So we're seeing quick wins by just adapting the machine learning — tweaking, for example, the activation functions we use, or going from max pooling to average pooling in a deep neural network — and we're seeing big wins there.

So then the question becomes: is there an upper bound on those quick wins? I tried to formalize this question. I phrased it as a no-free-lunch hypothesis: given a machine learning algorithm, can you find a roughly equivalent version of it — equivalent in terms of runtime and in terms of generalization performance, with only a negligible difference — that can be computed securely? Here's that in pictures. In theory, anything you can compute in plaintext should be computable in a secure way; in practice, efficiency constrains what we can actually do. But if this hypothesis were true, that wouldn't really be the case: at least for machine learning, there would always be some slightly different machine learning model that runs just as fast, but securely.
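To give a flavor of the model tweaks I mentioned — swapping ReLU for a polynomial activation and max pooling for average pooling — here's a small tf.keras network written to stay arithmetic-friendly; the layer sizes are hypothetical and purely for illustration, not a model from any of our papers:

```python
import tensorflow as tf
from tensorflow.keras import layers

def square_activation(x):
    # Polynomial activation: cheap under arithmetic secret sharing or
    # homomorphic encryption, unlike ReLU, which needs a secure comparison.
    return tf.square(x)

def make_mpc_friendly_cnn(input_shape=(28, 28, 1), num_classes=10):
    return tf.keras.Sequential([
        layers.Conv2D(16, 5, padding="same", input_shape=input_shape),
        layers.Activation(square_activation),
        layers.AveragePooling2D(2),   # average pooling instead of max pooling
        layers.Conv2D(32, 5, padding="same"),
        layers.Activation(square_activation),
        layers.AveragePooling2D(2),
        layers.Flatten(),
        layers.Dense(num_classes),    # logits; softmax can happen after decryption
    ])

model = make_mpc_friendly_cnn()
model.summary()
```

Whether a squared activation keeps enough learning capacity for a given task is exactly the kind of empirical question this hypothesis is asking about.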
The problem with that hypothesis is that the way we prove security in MPC doesn't lend itself well to it. This is an undecidable statement when you look at how we actually prove security — the simulation paradigm, if you're interested in that. So maybe we need a new concept of security to be able to answer this question in a way that's helpful.

Just to reiterate one thing here: with federated learning and secure computation, we have this threat model dilemma where the one that scales doesn't have a strong security guarantee, and the one that doesn't scale does. How do we reconcile this? What I'm proposing is to analyze the incentives of everyone in this machine learning game, treating it as an adversarial game where anyone could steal another party's asset if it's worth their time. If you know the utility function of each party in this private machine learning game, then instead of relying on computational hardness and information theory, you can rely on the fact that they're motivated, and that you know their motivations, and then you can use game theory to prove security. That only works because you can understand their motivations, and only because you can price the different components in this system — in particular data and models. If you can price those, you can do this. That's where the details lie, and that's where we still need to figure things out. But this has a long history as well, actually; it's called rational cryptography. It failed to take hold because it was trying to solve generic cryptography problems, and utility functions are extremely hard to estimate in those generic cases. But as I've been saying, in private machine learning we only ever need to estimate the utility of two things: data and models, or pieces of models. And we already have ways of doing this in both cases. So if we can pull this off, it could give us MPC protocols that are faster but still provably secure, and we can use it to derive security guarantees for federated learning in particular — to see just how weak or strong that guarantee is for a specific use case.

So, in conclusion: privacy is a security problem. That security problem creates leakages — or rather, bottlenecks — in the current paradigm of machine learning. And these techniques I've been talking about — differential privacy, federated learning, and secure computation — can alleviate those bottlenecks if you let them. The field is really wide open. There aren't many people working on this, but it is a very exciting area to be in. And there are now packages and tools that you can pull down from GitHub and play around with on your local computer in a matter of minutes. To me, that's a really strong recipe for people in security and machine learning to experiment and just play with this stuff. So thank you for your time. I've included some resources, and we'll get these slides up so you can play around with it if you want. Thanks.

Yeah, so actually, unfortunately, I would say detection is hard. In particular for federated learning, this is a huge problem, because you're just sending the model out: they can do whatever they want with it, they can optimize it however they want, they can feed it whatever data they want. There are defenses — the field is called Byzantine-resilient machine learning — but it's still new as well.
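One defense of that flavor, which I'll say a bit more about in a second: aggregate client updates with a coordinate-wise median instead of a mean, so a single poisoned update can't drag the result far. A minimal sketch, with made-up updates:

```python
import numpy as np

def aggregate(updates, robust=False):
    """Combine client model updates. A coordinate-wise median tolerates a
    minority of poisoned updates far better than a plain mean does."""
    updates = np.stack(updates)
    return np.median(updates, axis=0) if robust else np.mean(updates, axis=0)

rng = np.random.default_rng(1)
honest = [np.array([1.0, 1.0]) + 0.01 * rng.normal(size=2) for _ in range(9)]
poisoned = honest + [np.array([100.0, -100.0])]   # one attacker-crafted update

print(aggregate(poisoned))                # mean: dragged far off by one client
print(aggregate(poisoned, robust=True))   # median: stays near [1, 1]
```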
What's nice about secure computation, though, is that you can construct the protocol so that it's impossible for people to add computation to the function evaluation. They can only ever follow the protocol in lockstep, and if they deviate at any point and try to modify the result in an unanticipated way, the protocol just breaks — it simply won't work. That's the difference between passive security and active security in the field. So there are some defenses, and even in federated learning, for example, at the aggregation step where you aggregate the different parties' updates, instead of just taking an average you can take the median of those updates. The median is much more robust to outliers, and poisoned data is much more likely to be an outlier relative to the distribution. So that's the current trend around that. But to push the median, you have to go really far — you have to control a lot of the data. You can't do it with one point; you have to do it with a large subset. So you can get resiliency up to a third or a half of the parties, but going past that is extremely difficult. With secure computation, that's not the case: if there are n parties, n minus one of them can be corrupt and you still can't break security. That's another nice feature of it.

So the question was: why are some models more efficient than others in the encrypted space, essentially? Well, the answer depends on which technique you're using — there's this big list of them. That's actually only true for these three on the right; secure enclaves, by contrast, can handle arbitrary computation, and they're roughly as fast as regular computation. The problem there is that there are demonstrated side-channel attacks on these enclaves, so they're potentially not as secure as the ones on the right. And the ones on the right have different operations that they're optimized for. Garbled circuits are really efficient for comparing items: you essentially build a Boolean circuit, scramble it in a way that's privacy-preserving, and then evaluate that circuit, which lends itself really well to taking maximums and things like that. But for matrix multiplication, for example, it's a lot harder to get that down to a binary circuit, and garbled circuits have some other problems too. So most people right now are doing homomorphic encryption or secret sharing. Those are really optimized for arithmetic — addition, multiplication — but comparisons, for example just doing a ReLU, are actually really expensive with those, because they don't fit into arithmetic. So there is a theoretical understanding of why different things are faster; it's more about figuring out how you can stay within those confines and still learn the same things you could learn with a normal machine learning model. For example, if you remove ReLU and replace it with something arithmetic-based, can you still get the same learning capacity? One of our interns is actually running some experiments around that.

Can you repeat that? No — in general, the function and the protocol for encrypting these things are known ahead of time. What's not known is the randomness used in the specific mechanism. It's kind of the same way public-private key cryptography works, just with slightly different setups. If you get the key, however, then yes, you can break it.
In homomorphic encryption, that's a hardness question; in MPC, it's a collusion question. But yeah, cool. Any more questions, or no? Cool. Thanks.