For those of you who aren't as familiar, I'd like to say that zero-knowledge proofs are to digital signatures as Ethereum is to Bitcoin. We're replacing a digital signature with any program, right? And the thesis of ZKML is that the most interesting program to put in a zero-knowledge proof is a machine learning model. The reason is that it makes it as though the model, the AI, is actually running on chain. So we're giving intelligence to Ethereum, or whatever chain it's running on. It's as if we're giving it eyes, ears, sensory organs with which to perceive the physical world, with which to make decisions about physical reality, and about the most important bits of physical reality to us, which are the humans, right? What our intent is. So it makes it possible for humans, not just field elements, to control their destiny on chain, to control digital assets.

So I'm imagining that in the future, we'll be able to say, just like we talk to our smart speakers, "Hey, Ethereum, please transfer 10 ETH to Althea," and that will work, right? Because some ensemble of machine learning models will be able to combine all the signals of my intent, all the data exhaust that I've given off in my life, and decide whether that request is genuine. For a human, that exercise is trivial: if we've talked for 10 minutes and I make that request two years from now, you'd be able to authenticate me and send that message. The fact that we can't do it yet with Ethereum is just a technical problem.

And finally, you could think about it as a way to let smart contracts exercise judgment, right? Deal with any kind of ambiguous situation, decide if a contract is satisfied, decide if a news story says that a hurricane has hit the coast, that kind of thing.

And this little chart here is just a reminder that both the input and the parameters can each be chosen private or public, and all four of those squares are interesting. Everything's public?
We can think about the news model, a scalable oracle. If everything is private, maybe we're thinking about a decentralized Kaggle, or some kind of medical situation where, on the one hand, the model has to be kept secret for various reasons, and the patient data has to be kept secret too.

So what we've built is a program called EZKL. It is a tool that turns an ONNX model, which is an export you can make from PyTorch or TensorFlow, so a baked machine learning model that's been trained, into a zero-knowledge proof. We can prove and verify at the command line, and export to a binary, a smart contract, or Wasm. We're adding new layers to it daily, and it's enough for small production models. The scale is increasing very quickly, roughly two to eight times a month depending on how much time we choose to spend on optimization as opposed to adding features. And I expect that we'll be able to do some really exciting identity stuff within maybe six to twelve months, maybe sooner. We've been building in the open since July. It's Apache 2.0 licensed, and that's the repo. We would love contributions from you if you find this interesting.

So here's an example of the kind of thing we can do right now. If you like Python, this is very relaxing compared to writing circuits. You can define a forward model; X, Y, and Z are tensors there, with shapes that are determined at runtime. And I can define somewhat arbitrary functions of those tensors, matrix multiplications, powers, non-linearities, compose them, and then the tool will determine a quantization strategy, figure out the runtime shapes of the tensors, and translate that into something that can be run as a zero-knowledge proof. So the pieces we can do right now are: describe what we're going to do, do a mock or a full proof (the back end is all Halo2), complete a proof, and verify a proof. And then there are some technical parameters.
This is an applications talk, so I want to give you a sense of what's possible, and then we'll talk about which applications are cool. Right, so this is an example of computing the table for a very simple model that just has an input, a weight, and a bias, a matrix multiplication, and a ReLU. We can do a proof, looking at different proving back ends, and we can do a verification. The proof step will construct a little JSON file that contains the data of the proof, which you can upload to whatever the verification target is, and then you can also run a verifier at the command line to check that proof.

And okay, so now we go back to applications. I always like to show a little bit of code so that we can think about what stimulates the imagination. I like to think of this as allowing scalable, automated oracles, right? There are three stages to that. The first stage is ingesting some kind of signed data. It works fine with unsigned data too, but then there's an adversarial problem, so we need to somehow rate-limit the amount of data that's coming in, or make sure that it's been signed off on by, say, a news organization or a camera or something like that. Then we run a model, maybe a text model, maybe image classification; it makes a decision about what the data was. And finally, there's an on-chain verification that then feeds back into the attestation loop and can be used in the next model.

So, interesting things in signed ingestion. This space is really at an interesting stage. There are solutions. There's something called SXG, Signed HTTP Exchanges, a standard promulgated by Google as a kind of replacement for AMP. There's an NGINX plugin. There's a one-click button on Cloudflare. A lot of people have promised to do it; almost no one has. There's email; there are a few people from 0xPARC who have worked on email, which you can hear about from Aayush.
There are images signed at the publisher: a news organization, the New York Times, etc., might sign off on its images so that other people can't say, oh, the New York Times said this bad thing happened. And then there are third-party notaries, TLSNotary-like protocols, that you can use to create signatures as a semi-trusted third party. And they all use primitives that we can now verify. But I think we're going to need an industry push, like the push to HTTPS, to get everyone to sign their data. It's technically easy, and it lets us compose things in zero-knowledge proofs, but people just don't bother to do it right now, although they've all promised to.

So the second stage is the models, text models, classification, and what kinds of models we can build. That's a scaling process. You could describe our roadmap as "ontogeny recapitulates phylogeny," which is to say: we look at all the models that have been built over the history of machine learning, and you can download them. You try to run your tool on one. It doesn't quite work, because there are scaling problems or quantization problems, or because you haven't implemented all the nodes. Then you fix it, right? And then you repeat. As I said, it's an open source project, and we would love you to join us in this rotation.

And then we have to worry about scaling, and there are five "-ions" that we use for scaling. Optimization, which I won't say much about; I think Yi Sun will talk about that a little bit in his talk. Aggregation, which is a tool for combining multiple proofs into one proof and checking it once. Recursion, which is being able to verify the last SNARK inside the new one; that lets you do something like a separate proof for each layer, right? That gives you another scalability tool. And between aggregation and recursion, what you're going to see is that instead of being memory-constrained, which is usually what limits the size of the models we can handle, we're just money-constrained, right?
How much computation would we like to spend? It has a lot of overhead, but it essentially eliminates the limits on how big a model can be. That's why I can make grandiose claims about the size of the models we're going to get to. And then finally, fusion is the strategy of: once we have a higher-level understanding of the intent of the programmer, of the computational circuit they're expressing in Python or whatever, it's easy to swap in more sophisticated zero-knowledge arguments. Instead of working at the level of constraints, we work at the level of a matrix multiplication or a convolution, and we can use those arguments and make it invisible to the person who created the machine learning model. The model doesn't change at all, but it gets faster and faster in the back end.

Right, so finally, on-chain verification. This is, as I think you've heard, severely constrained by the precompiles we have available in Ethereum, and a lot of the strategies are just about getting it to fit into BN128. So the way it can be done now, and probably in the future, honestly, just because of the nature of zero-knowledge proofs, is that you have a first stage, which takes some sort of input, with easy-to-make yet hard-to-verify proofs, sort of wide proofs. Those proofs are aggregated using an aggregation strategy, which might require a fairly hefty machine; one example I ran used a machine with 450 gigabytes of RAM. That produces a hard-to-make but easy-to-verify proof that can then be checked on Ethereum. Right now it costs about 600,000 gas to do that, but there are some things that should hopefully improve that over time.

Right, so putting all this together, what do we get? We get these scalable on-chain data feeds. You can think of it as: oh, Ethereum can read the news. Or, instead of having a network of nodes that make a decision, like in Chainlink, about something that's happened, and then vote on it...
And then there's a complex crypto-economic process by which they penalize people who attested wrongly. Instead, you just have one person: they download the signed data from Bloomberg, they produce the zero-knowledge proof, they upload it, and now everyone can trust that information. That enables a much more scalable firehose of data coming from off-chain, from Web2, to on-chain.

And it's my belief that ZKML will actually be table stakes for chains over the next five to ten years, because delivering on the promise of blockchain to a mass audience means that we really need "a human, not a field element, owns all my assets." There are lots of solutions I've heard at this meeting pointed toward that, and this is part of it. You can imagine, for example, when account abstraction starts working, that part of the account abstraction check you're doing is to submit a zero-knowledge proof of your identity, using this kind of 10,000-factor "everything about me" as one of the pieces. So ZKML oracles will be a simpler, faster, and more scalable way to put arbitrary off-chain data on-chain, and I think that really opens the firehose on what we can do.

And finally, as I said before, you can think of a ZKML model as a smart judge that interprets ambiguous events, and in the remaining two and a half minutes, let me just talk about a couple of examples, ideas for things that I hope people will build with this. One that's timely is ZK KYC, right? We can take a person and an ID and prove that they match, and that the ID number is not in some sanctions database, right? Technically, that's something we can do very soon. However, regulators won't accept it, right? Banks have a KYC rule; it says specifically that they have to know the customer, not that they have to know the customer is not on a sanctions list. So it seems like it won't work. However, maybe if we had done that, it would have prevented the Tornado Cash sanctions, right?
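The non-membership half of that check can be illustrated in plain Python. This is not a zero-knowledge proof and not EZKL; it only shows the statement such a circuit would enforce, using a sorted, committed sanctions list and an adjacent-pair witness. All names and ID numbers here are invented:

```python
import hashlib

def h(*parts):
    """Hash helper standing in for an in-circuit hash function."""
    m = hashlib.sha256()
    for p in parts:
        m.update(str(p).encode())
    return m.hexdigest()

# Publicly committed, sorted sanctions list (illustrative ID numbers).
sanctioned = sorted([1004, 2117, 5550, 9321])
commitment = h(*sanctioned)

def non_membership_witness(id_number):
    """Find adjacent committed entries lo < id < hi; this pair is the
    witness a circuit would consume to show the ID is NOT in the list.
    (Values below the minimum or above the maximum are not handled.)"""
    for lo, hi in zip(sanctioned, sanctioned[1:]):
        if lo < id_number < hi:
            return lo, hi
    raise ValueError("no witness: id is sanctioned or out of range")

def check(id_number, witness, commitment):
    """The statement the ZK circuit would enforce, checked in the clear."""
    lo, hi = witness
    assert lo < id_number < hi                     # strict gap: id is absent
    assert lo in sanctioned and hi in sanctioned   # stand-in for Merkle paths
    assert h(*sanctioned) == commitment            # consistent with commitment
    return True

assert check(3000, non_membership_witness(3000), commitment)
```

In a real deployment the membership of `lo` and `hi` in the committed list would be shown with Merkle paths rather than by revealing the whole list, and the ID itself would stay private inside the proof.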
If you're a DeFi developer or a mixer developer and you add that to your pipeline, it makes you less of a target. It lets you run faster than your friend when the bear is chasing you, or at least it does something to prevent unwanted actors from interacting with your contract. So even though it's not perfect, it's interesting.

I talked about prediction markets. You could imagine setting up a contract that pays out if a news story classifies to a particular outcome: someone won an election, a hurricane of a certain intensity hit a coast, a car received a lot of damage. Then a small classification model, because there are relatively few classes, can be used to decide whether that happened, and anyone can download the signed story, run the model, and submit the proof.

Another thing is fraud checks, gut checks for smart contracts. You could imagine that the abstracted account or the smart contract just has another little ZKML check that acts as a rate limiter and a fraud check and makes it harder for people to scam that contract, right? A proof of humanity, for example, would work, but so would lots of other kinds of fraud checks or checks of the state of the network. You could imagine that getting baked in, a little like Sybil protection, and it's okay if it's weak, because it's just one layer of security.

I think it's exciting to think about this as putting the A in DAO: really taking a situation where humans make a judgment over what happened, vote, and then some multi-signatories actually execute that judgment, and putting all of that into an effectively on-chain AI, right? So instead of working for a DAO together, we all work for the all-knowing AI in the cloud. Could be good or bad, but interesting. And, you know, it could be used, for example, to test whether someone who promised to do some work for the DAO did a good job, did the work, right? You can imagine making a classifier that does that.
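The prediction-market idea above can be sketched as follows. This is plain Python with no zero-knowledge machinery: the keyword rules stand in for a small quantized classifier, and all class names, function names, and story text are invented for illustration.

```python
# Hypothetical outcome classes a prediction-market contract could settle on.
CLASSES = ["hurricane_landfall", "election_result", "no_event"]

def classify(story: str) -> str:
    """Stand-in for a small classification model run inside the proof;
    a real ZKML model would be a trained network, not keyword rules."""
    text = story.lower()
    if "hurricane" in text and "landfall" in text:
        return "hurricane_landfall"
    if "wins the election" in text:
        return "election_result"
    return "no_event"

def settle(story: str, payout_class: str, amount: int) -> int:
    """The contract logic: pay out iff the model classifies the signed
    story into the agreed class. On chain, only the proof that the
    committed model produced this class would actually be checked."""
    return amount if classify(story) == payout_class else 0

story = "A Category 4 hurricane made landfall on the coast overnight."
assert settle(story, "hurricane_landfall", 100) == 100
assert settle("Markets were quiet today.", "hurricane_landfall", 100) == 0
```

The point of the design is that anyone holding the signed story can run the model off-chain and submit the proof, so settlement needs no committee or vote.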
You can think about using it for genetic screening situations where, for example, the model has to be secret because it's been trained on non-consented data, and the data has to be secret; just as when someone gets an STD test, they would like to choose whether or not to reveal the result to anyone, to maintain anonymity. So that's another application.

And finally, it composes well not just with MPC but also with things like differential privacy. This doesn't necessarily need a blockchain, but you can imagine a data owner who wants to protect their data. They can release noisy summaries, in the differential privacy sense, and then the querier can run the model, torture the data, and send a final query, which can then be run in a proof to show that the real model, held only as a commitment, was used to create the noisy marginal summary, and that when it was run on the data, it produced the real result.

Okay, so I think I'm out of time. Thank you.

Thanks, Jason. Jason, if you want to take one or two questions while Yi Sun gets set up, assuming he's here. Ah, here, hello. Yi Sun is going to talk about another aspect of ZK machine learning, specifically on neural networks. But did I see a question over there? Questions? Yep.

Machine learning models are often already black boxes. How do you verify that the right machine learning model did the prediction or the verification?

Yeah, you can commit to the model. That's particularly easy when the model is presented as an ONNX file: you just take the hash of the file, and that's your model, for example. Or you can commit to the parameters themselves, so you make basically a hash of the parameters, or a hash of the whole thing. And that's how you know it's the model you want it to be.

Do we have one more quick one? Everybody else can take them offline. Sorry.

Really cool stuff. Has the ZK KYC sort of been worked on?
Is that a theoretically possible application of something like this? Or, like, what's the state of it?

Yeah, I'd say it's absolutely being worked on. I think it will be a little while before it's feasible, but, you know, it's definitely something that I expect multiple people will probably create in the next year.
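The model commitment described in the Q&A above, hashing the exported file or hashing the parameters themselves, can be sketched like this. The file name, parameter layout, and serialization format are illustrative assumptions, not a real tool's scheme:

```python
import hashlib
import struct

def commit_to_file(path: str) -> str:
    """Commit to the whole exported ONNX file; the verifier pins this hash."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def commit_to_parameters(params) -> str:
    """Alternatively, commit to the flattened parameters themselves,
    serialized deterministically so the same weights give the same hash."""
    m = hashlib.sha256()
    for layer in params:
        m.update(struct.pack("<%dd" % len(layer), *layer))
    return m.hexdigest()

# Illustrative weights for two layers.
weights = [[0.5, -1.25, 3.0], [2.0, 0.0]]
c1 = commit_to_parameters(weights)
c2 = commit_to_parameters([[0.5, -1.25, 3.0], [2.0, 0.0]])
assert c1 == c2  # deterministic: same weights, same commitment
assert c1 != commit_to_parameters([[0.5, -1.25, 3.1], [2.0, 0.0]])
```

Either way, the proof is bound to the committed hash, so a verifier knows exactly which model produced the prediction without ever seeing the weights.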