So I'm Noah Johnson. I'm co-founder and chief product officer at Oasis Labs, and I'm going to talk about privacy-preserving smart contracts at scale.

If you follow the news, it seems like every article in the last three years or so has written about the huge potential of blockchain: how blockchain can change the world, how it can enable fundamentally new applications and unlock value in different verticals like health care. Now, if you compare this to where we are today, it's clear we're not there yet, right? What we have today are things like CryptoKitties. CryptoKitties are great, but they're not going to change the world.

And so at Oasis, what we're focused on is the question of how do we get there? How do we build a computing platform that allows developers to build these sorts of applications, to fulfill exactly these promises? So we're building a smart contract platform that can support real-world applications, and what makes our platform unique is that we're focused on both privacy and scalability together. By privacy, I mean the ability to protect the data and state of a smart contract as it runs on a decentralized network. And by scalability, we mean not just high TPS and low latency, which is what most projects talk about when they discuss scalability; what we mean specifically is the ability to scale to fundamentally more complex types of applications. We think these two things are absolutely necessary to support the applications of the future.

If we take a step back, there's a reason people are so excited about blockchain in the first place. Blockchain has a number of really exciting properties, like the fact that it's completely open and transparent and doesn't rely on any central party, and of course, through smart contracts, we can automatically enforce agreements even with people we don't otherwise trust. But blockchain by itself actually doesn't provide privacy. Blockchain can provide anonymity, in that I can have a wallet that's not linked to my
identity, but it doesn't protect the data and the computation that's running on the network. In fact, today's blockchain platforms are designed so that all data is public; this makes the architecture a lot simpler. And this is a huge limitation: if you think about all the applications on the internet that could benefit from these properties, and then you think about which of those would still work if you required that all of the data was publicly accessible to everyone in the world, the list gets pretty small. So this is, I think, one of the reasons we haven't seen many of these breakthrough applications yet.

So what could you do if you were able to provide privacy? Here's a list of some of the applications that people are already building, or are excited to build, on blockchain: things ranging from credit scoring to private escrow to games and private tokens. What do all these things have in common? Well, they all need privacy. They all need the ability to protect data or protect state, either because the data is sensitive (for example, user data in the case of health care), or because you need to keep certain aspects of the application secret in order for the incentives to work out. In a game, for example, I need to be able to keep the secret part of the game secret from the miners and the players to make sure they don't cheat.

So these are all really exciting applications, but if you wanted to run them on today's platforms, you wouldn't be able to. To illustrate that, let's take a simple example: credit scoring. I could go out and write a smart contract that does credit scoring, right? I could implement this in Solidity; the contract could take in some data from various institutions and output a model for predicting a credit score. Now, if I tried to take this and run it on today's platforms, I'd face a number of issues that make this completely impractical. The first issue is that if the output of my smart contract is derived
from sensitive input data, then I have to be very careful to make sure the output doesn't leak anything about the inputs. This might seem intuitive: if you're doing machine learning, of course you're not leaking information. It turns out that assumption is completely wrong, and you need to employ solutions at the application layer to protect against information leakage in this way.

The other problem, of course, is that the workers that actually execute the smart contract will have access to all of this really sensitive data. So if I don't trust those workers, and this data is really sensitive, they could trivially snoop on the data or leak it. For that reason, I wouldn't be able to run this on a platform where all the data is public.

And then finally, this is a relatively complex application. Everyone's familiar with the scalability challenges of today's platforms; right now, if I wanted to do anything even moderately sophisticated in a smart contract, I would face really poor performance and very high costs.

So that brings me to our solution. At Oasis, we're building a new platform that tries to address all of these problems. Specifically, we have an architecture that protects privacy both at the application layer and at the platform layer, and a new approach for scaling to more complex application workloads. I'll talk about each of these layers independently; I'll describe how our solution works and then how you can use it as developers.

So let's start with the platform layer: how do we protect data that is running on a network of nodes whom we may not trust?
If you look at the model for smart contracts today, a smart contract is a piece of code that takes in some input; running the code produces some state transition that's recorded back to the ledger. And on platforms like Ethereum, of course, all the inputs and outputs are public. The concept of confidentiality-preserving smart contract execution is that all of these inputs and outputs are encrypted. That is, they remain secret, and nobody in the world except the smart contract itself can actually view their contents. And this isn't just about encrypting the data: the challenge is being able to still run computations on the data without allowing others to view it, and to protect the data both during storage (when the state is stored on the ledger, for example) and when the data is actually being computed on.

It turns out there are a number of different technologies for supporting confidentiality-preserving smart contracts. These range from the use of special hardware to cryptographic techniques like secure multi-party computation, zero-knowledge proofs, and homomorphic encryption. Each of these techniques has different trade-offs in terms of the performance overhead, the class of applications it can support, and the security model, that is, where the security guarantees are coming from. So there's no single approach that is strictly better than all the others; it depends on the application and the use case. Oasis will support all of these techniques and allow the developer to decide which one they want to use. I'll talk briefly about secure hardware, because it's the one we'll support initially and have done the most work on so far. So, are people familiar with secure hardware? Have you heard of trusted hardware?
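To make the flow concrete, here is a deliberately simplified toy model in Python. This is not Oasis code: the XOR "cipher" and the shared key stand in for real authenticated encryption and for the attested key exchange that secure hardware provides, but the shape of the flow is the same: workers only ever see ciphertext, and only the contract's key holder can read inputs and outputs.

```python
# Toy model of confidentiality-preserving contract execution (illustration
# only; a real platform uses authenticated encryption and attested enclaves).
import secrets

def xor_bytes(key: bytes, data: bytes) -> bytes:
    # One-time-pad-style XOR, standing in for real encryption.
    return bytes(k ^ d for k, d in zip(key, data))

class ToyEnclave:
    """Stands in for a secure enclave holding the contract's secret key."""
    def __init__(self):
        self._key = secrets.token_bytes(32)  # known only inside the "enclave"

    def public_encrypt(self, plaintext: bytes) -> bytes:
        # In reality the client encrypts over an attested secure channel;
        # here we simply reuse the shared key for brevity.
        return xor_bytes(self._key, plaintext)

    def execute(self, encrypted_input: bytes) -> bytes:
        # Decrypt inside the enclave, run the contract, re-encrypt the result.
        plaintext = xor_bytes(self._key, encrypted_input)
        result = plaintext.upper()  # the "contract logic"
        return xor_bytes(self._key, result)

enclave = ToyEnclave()
ciphertext = enclave.public_encrypt(b"hello world")
assert ciphertext != b"hello world"       # workers only ever see ciphertext
encrypted_result = enclave.execute(ciphertext)
# Only the key holder can read the output (here we peek for illustration):
result = xor_bytes(enclave._key, encrypted_result)
print(result)  # b'HELLO WORLD'
```

The point of the sketch is the separation of roles: the network relays and stores only ciphertext, while decryption and computation happen inside the isolated enclave.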
Great. For those who aren't familiar, secure hardware is a technology built into processors that allows code to construct what's called a secure enclave. The secure enclave can run computations and store data in a way that is completely isolated from the rest of the machine. That is to say, no other application on the machine, nor the operating system, nor even the user of the machine is able to view or modify the data in the secure enclave, and this is enforced at the hardware level. So secure enclaves provide both integrity, that is, assurances that the computation result hasn't been tampered with, and confidentiality, that is, knowing that no one can view the data, as long as we construct the enclave carefully. The other thing secure hardware provides is the ability to generate a certificate that can prove to a remote party, one without physical access to the machine, that the code is running on correct hardware and that the contents of the enclave are correct.

In our previous work, we published an academic paper that shows how to use secure hardware to run smart contracts in an enclave to provide end-to-end privacy. That paper is called Ekiden, and there are a lot of interesting details to getting this to work, so I encourage you to check out the paper. We're also working with the Keystone project on a fully open-source design for secure hardware. This is based on the RISC-V architecture, and it will be a design that anyone can audit, anyone can verify, and any chip maker can manufacture. In the meantime, we're using the trusted hardware technology that's available today, including Intel SGX, but in the future, as new technologies become available, they'll be exposed to developers through the Oasis platform.

Tomorrow morning at 10 a.m.
there's a workshop with Intel and a few other companies that are using secure hardware, so if you're interested in this subject, I encourage you to check out that workshop.

Okay, so how do you actually use this feature? Suppose we want to hold an election through a smart contract, so I'm going to implement a ballot as a smart contract. Typically today, if you want to interact with a smart contract, you would have a dApp that includes a Web3 library. The dApp would call into Web3, that library would issue transactions to a Web3 gateway like Infura, and those transactions would be relayed to the workers that execute them. And of course, in this model, all of the transaction data and the state are public. This is a great design, but it's designed specifically for the model where all of this transaction data is public. If we want to support the ability to keep state and data secret, we need to make extensions to this Web3 interface, and that's exactly what we've done.

We have an extension to Web3 called confidential Web3. What this does is allow callers to locally construct a secure channel to the contract and then send transactions to the contract in such a way that the contents of the transaction are encrypted, so only the smart contract can actually decrypt and view them. Essentially, we have an end-to-end encrypted channel to the smart contract: no other machine on the network, not the gateway, not the Web3 provider, not even the workers, is able to see the contents and payloads of these transactions. And the smart contract state is encrypted with a different key that's known only to the smart contract, so users can't decrypt the state, but they can decrypt, for example, the responses from the smart contract.

The library we're releasing will be available as a JavaScript library that you can integrate in your dApps, and the library handles all of these details under the hood,
establishing the channel and generating secure transactions, so that the interface for calling into confidential smart contracts is very similar to what you're used to today. That is, your code requires only very minimal changes to use the confidential version versus the existing version.

So I've just described how Oasis can protect data at the platform layer from nodes and workers on the network. This is essential, of course, for protecting privacy and supporting applications that need to protect data, but it's actually not enough for certain applications; in certain cases, you need to do something at the application level as well. To motivate this, let's ignore blockchain for a second and just talk about a general problem: some data scientist wants to do something useful with sensitive data, maybe train a machine learning model. So they would write some code that trains the model; that code takes in sensitive training data and outputs the model. Now, even if we don't release the model, even if all we allow is for people to query the model (and that's the whole point of training a model in the first place, so that it can serve inferences), it turns out that even through the inferences generated by the model, it's possible to leak the original sensitive data from the training data set. There's actually recent work showing that this happens even if you're not intentionally trying to leak the data. Even in a completely benign setting with a pretty straightforward machine learning pipeline, it's possible for the model to inadvertently leak, for example, credit card numbers and social security numbers; this was shown in a recent paper. So clearly, if we want to move this to blockchain and we want to provide end-to-end privacy guarantees,
we need to protect against these sorts of issues as well. And this is actually a real-world problem that companies in industry face. In our previous work, we collaborated with Uber to help them solve a very similar problem. In their setting, of course, they collect customer data about rides, destinations, and so forth, and the data is extremely valuable, so they want to make it accessible to their data scientists internally, to mine the data and generate business insights. On the other hand, because the data is sensitive, they want to make sure they're protecting the privacy of individuals. So the big challenge is: how do you allow safe analytics, things that don't violate privacy, while disallowing things that do violate privacy?

A good example to contrast: something safe is a statistical query like "How many people visited New York last year?" Clearly, that's not going to violate anyone's privacy; nobody in this data set would object if that result was returned to the analyst. On the other hand, if the analyst asked "How many times did Joe go to Whole Foods last week?", that's a statistical query on exactly the same data, but it's a pretty serious privacy violation. So how do you automatically guarantee that the system will allow queries like the first and disallow queries like the second? This is a very hard problem in general. One of the techniques that we've used is a technique developed in academia called differential privacy. By the way, are people familiar with differential privacy?
So, fewer people. Great. Differential privacy is a formal definition of what privacy means, a formal mathematical definition. Intuitively, it says: take any two databases of sensitive information, where one of those databases has a particular user and the other is identical except it doesn't have that user. So database one has Joe; database two has every record in database one except Joe. The differential privacy definition says that if I take any query, or train any machine learning model, or do anything on both of these data sets independently, then the results of those computations have to be essentially the same whether the computation is run on the first database or the second. What this means is that from the result, there's no way to tell whether Joe was actually in the database, because the result would look essentially identical even if he wasn't. And given that I can't even tell whether a user is in the database, I certainly can't learn their sensitive information. So the outcome is the same with or without Joe's data, and the property has to hold for every user and every database.

The nice thing about this definition is that it provides very strong guarantees for individuals while still allowing those sorts of general-purpose analytics and machine learning, which is the very reason we often want to analyze data in the first place. If a user knows that the system enforces this property, they don't need to worry about what the query is doing. They don't need to inspect the query.
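The classic way to achieve this for a counting query, from the differential privacy literature rather than from Oasis specifically, is the Laplace mechanism: add noise drawn from a Laplace distribution whose scale depends on the privacy parameter epsilon. A minimal sketch:

```python
# Classic Laplace mechanism for a counting query (textbook differential
# privacy; not Oasis-specific code).
import random

def private_count(database: list, predicate, epsilon: float) -> float:
    """Return a noisy count satisfying epsilon-differential privacy.

    A counting query has sensitivity 1: adding or removing one user changes
    the true count by at most 1, so Laplace noise of scale 1/epsilon suffices.
    """
    true_count = sum(1 for row in database if predicate(row))
    # Difference of two i.i.d. exponentials is Laplace(0, 1/epsilon).
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise

with_joe = ["alice", "bob", "joe"]
without_joe = ["alice", "bob"]

# The two noisy answers are statistically close, so the released result
# reveals almost nothing about whether Joe is in the database.
print(private_count(with_joe, lambda r: True, epsilon=0.1))
print(private_count(without_joe, lambda r: True, epsilon=0.1))
```

Smaller epsilon means more noise and stronger privacy; larger epsilon means more accurate answers. Choosing and budgeting epsilon across many queries is exactly the kind of detail a system has to automate for non-experts.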
They have assurances that, whatever is returned, there's no way for the person who sees that result to know whether they're in the database, so there's no reason for them to object on privacy grounds to being in the database in the first place. This is a very nice property.

Our work focused on how to take this property and build real-world systems that automatically enforce differential privacy, and other types of security policies, for real-world queries in a real-world setting like Uber's, and in particular to make it usable by non-experts. That means hiding all the complex details of how to actually make this work, so that analysts and developers have a very simple interface: they don't need to make major changes to the way they write their code, and the system automatically enforces differential privacy and can prove that it enforced it. We have a number of papers describing this work, and the system we built based on this research is now deployed internally at Uber: all queries submitted internally by Uber analysts are routed through the system to enforce properties like differential privacy and other policies around compliance with GDPR, and there are other pilots underway. The tool itself is available open source.

We want to put this in the hands of smart contract developers. This has already been battle-tested in enterprise, and we want smart contract developers to also benefit from this technology, so that any developer can use it even if they're not a privacy expert. We're going to release what we call the Oasis Privacy SDK, which allows a developer to write a smart contract that includes some of these libraries. That means that if the smart contract is, for example, training a machine learning model, even on sensitive data, there are provable guarantees that the output doesn't leak any information about the inputs. And essentially what this means is that
developers can demonstrate to users, because smart contract code is auditable, that they're not violating privacy; the user can verify that their privacy is not being violated. This is really an unprecedented property. It's something you don't have in traditional cloud applications: the user doesn't need to take the developer's word for it; they can prove to themselves that their privacy is not being leaked. We don't need terms of service, we don't need privacy policies; everything just happens automatically. For example, if you wanted to build a data market, you would be able to use this SDK, build a machine learning pipeline, and have the users that interact with that data market know that there's no way, even if you wanted to, to violate their privacy. In the future, as privacy becomes more and more important to users, I think features like this will be a huge competitive advantage for applications that are moving to blockchain versus traditional cloud applications.

Okay, so that's a glimpse of some of the solutions at the application layer. Finally, I want to talk briefly about scalability, and in particular how we can scale to more complex types of workloads. Scalability is a complex subject; I won't go into full depth, but I'll sketch out at least the intuition behind some of our architecture.

If we look at why scalability is important: as these smart contract platforms continue to evolve, people are building more and more complex applications, and as the applications get more complex, congestion on these networks increases. I'm sure everyone's familiar with CryptoKitties; when it became popular last year, it slowed down the entire Ethereum network and caused transaction fees to go through the roof. So if we extrapolate to a future where what we want to build is not just the next CryptoKitties, but rather a new paradigm for sharing data and doing machine learning on sensitive data,
these applications are significantly more complex than anything being deployed today, and that means we need a new architecture that can accommodate them.

Now, I want to emphasize that scalability is more than just high TPS. When most platforms talk about scalability, usually what they're referring to is the number of transactions they can process in a given time period. This is an important metric, but it doesn't capture the full story. There are a number of variables that determine how well a platform performs in practice, and one of the most important is what the workload looks like: how complex are the computations? A platform might process thousands of very simple transactions like payments, yet completely fail to process anything even moderately more complex than that. So we architected our system specifically to provide scalability for complex applications, including those that have dependences on other smart contracts.
And so this is actually an architectural solution. To sketch why it's necessary: if you look at the design of most existing platforms, the reason they don't work well for complex transactions is that every node runs the same computation, which means there's an enormous amount of redundant work and no parallelism. Also, every node has to do both execution, that is, running the code, and validating transactions and maintaining the ledger. This means that if there's a slow transaction, like a CryptoKitties breeding transaction, it will hold up even faster transactions that could have confirmed immediately if the machine had been free to validate them; those have to wait for the slower transaction's computation to finish. So fundamentally, slow transactions cause congestion for faster transactions. And this is true even for typical scaling solutions like sharding: a simple sharding scheme just partitions the network into multiple groups of nodes that all have this same property, so even with sharding you still have this congestion issue within each shard. We need a fundamentally new approach.

One of the main intuitions behind our architecture is to separate out the computation, that is, the machines that actually run the code, from the machines that validate transactions and manage the ledger. One of the challenges, of course, is how you do this in a scalable way: how do you allow the validating nodes to verify that a transaction is correct
even if they can't see the transaction data? Earlier, I showed that the concept of confidentiality-preserving smart contracts is that all of the outputs are secret, so these nodes have to be able to verify that the transaction result is correct even when the data is encrypted. There are a lot of details like that that need to be worked out, but fundamentally, the advantage of this architecture is that these layers can be scaled and optimized completely independently. If there's a compute bottleneck, for example because there's a burst of transactions that require more computing power, it's very easy to scale up just the computing side of the network. Conversely, if there's congestion during consensus because too many transactions need to be confirmed in a given time period, you can easily add more nodes to the consensus pool. So you have a lot of flexibility for performance optimization and scalability. This is most beneficial for really any kind of complex application, like machine learning, and for applications with mass adoption that require the sort of flexible scaling you don't have with today's bundled architectures. We'll publish a paper soon that describes this architecture in more detail.

Okay, so let's put this all together and show how all these features work together to solve real-world problems, and hopefully get you excited about the sorts of applications you can build on this platform. I'll talk briefly about an application that's currently being developed on Oasis called Kara. This is a collaboration between Berkeley, ETH Zurich, and Stanford. Kara is designed to solve the problem of data silos around medical data. Medical data is very sensitive, but it's also very valuable: if researchers had better access to medical data, they could develop better systems for personalized health, predicting health conditions, and so forth.
So Kara implements a data market specifically for medical data. The way it works: a user inputs their data into a smart contract, and a researcher that wants to train a model on that data can submit the model to the smart contract and pay in tokens to run that computation. The entire computation runs directly in the smart contract; the users receive compensation for their data, and the researchers receive the trained model.

This is ideal because, on the one hand, since this is implemented as a smart contract, users don't need to trust the researchers. They don't need to vet them or worry about exactly what the researchers are doing; they have assurances that their data will only be used for this specific purpose of training a model. And because they can verify that the model enforces differential privacy, they know that while their data is going to be used for these analytics, the output won't reveal anything about them individually, so it's completely safe for them. Researchers, on the other hand, now have access to richer and broader data sets than before, because users are more willing to contribute their data. And the fact that the whole thing runs on Oasis means that even the workers running the machine learning can't see the data, so we have end-to-end privacy guarantees for users and their data. And of course, you need scalability to these complex types of computations in order to run all of this directly on chain.

Great, so hopefully this gets you thinking about the sorts of things you can do. Remember the slide earlier where I talked about unleashing the value of medical data? This is exactly an example where blockchain can solve new problems and bring value to places where the value is currently locked up specifically because of privacy concerns. So we want to support and
empower developers building on Oasis to develop these sorts of exciting applications. We're releasing a number of tools and libraries that support diverse applications, ranging from machine learning to data markets to games, and even standard Ethereum-style smart contracts. We also have a startup hub, which I'll discuss briefly, to provide more direct support to teams that need it, and we want to build a strong community together. We announced this startup hub a few weeks ago; you can apply at the URL at the bottom. It's for teams that are working on really complex applications and want a little more support and mentorship directly from the Oasis team: it gives direct access to Oasis engineers, and access to our list of top investors as well. So if you're working on an exciting application, please apply. We'd love to have you in the program.

And then finally: we launched our private testnet a few months ago, and we've gotten a ton of excitement; there are a lot of developers already building on it. Based on the feedback from these developers and our experience operating the testnet, we're really excited to announce the next milestone here today. We're calling it a devnet, because the main goal is to allow developers to start building these sorts of privacy-preserving applications. What it will include: the confidentiality libraries I described earlier, the extensions to Web3 that provide new APIs to protect data during computation, and a contract kit that allows you to develop and debug contracts locally.
We support the EVM as well as WebAssembly, which means you can develop smart contracts in Rust, and we have a set of tools for doing that. We're backwards compatible with Ethereum: if you have an existing smart contract and you want to port it over to Oasis, you can do so without any changes, and your app immediately benefits from the scalability and privacy properties provided by Oasis. We have a set of resources and tutorials, including best practices for writing private smart contracts, how to build secret voting apps, how to build a game while protecting the state of the game, and a number of others; those will be available at launch. And finally, better reliability and performance, and we'll continue to improve performance as we implement more of our scalability architecture in the testnet.

So we're really excited about this; it will be available very soon. If you're interested, I encourage you to go to oasislabs.com. There will still be an application process, but we've streamlined it; you should hear back from us probably within 24 hours, and you can immediately start building your application so that it'll be ready to go by the time we launch our mainnet. If you want to build an application, you can go here. And if you're excited about this project, or about confidentiality, secure hardware, and machine learning in general, we'd love to work with you, either as researchers and collaborators or as team members. We're hiring; you can go to this URL to see the job openings.

So that's it. If you want to find out more, go to oasislabs.com and follow us on Twitter at @OasisLabs. Thank you. We have time for one question.

Yes, I just wondered, with your privacy-preserving smart contracts: if the inputs and outputs are encrypted and I can't see them run, how does consensus work on a blockchain?
Right, so we'll describe that in the paper we'll publish about this new architecture. Conceptually, for the nodes that validate transactions: the transactions still run on multiple workers, the encrypted results are sent back, and you can have the same sort of validation process that you'd have in other systems, where you compare the results from multiple independent workers. If there's any discrepancy in those results, then you know that one of the workers produced a wrong result, and there are ways of resolving that and punishing those workers. So intuitively, that's how it works; there are a lot more details, and those will all be in the paper. Good question.

I think we're out of time. I'm going to hang out here for a little while if you have other questions. Yeah, this room is empty, so if you guys have questions for them, you can just come up and talk in person. Thank you, guys.
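As a footnote on that final answer, the replicated-execution check can be sketched as a toy model. This is a hypothetical illustration, not the Oasis protocol: a real system compares commitments to encrypted results and handles dispute resolution and punishment, which this sketch only flags.

```python
# Toy sketch of replicated execution with discrepancy detection
# (hypothetical illustration, not the Oasis protocol).
from collections import Counter

def validate(results):
    """Accept the result if all replicas agree; otherwise flag a dispute.

    `results` is the list of (encrypted) result blobs returned by the
    independent workers that executed the same transaction.
    """
    counts = Counter(results)
    if len(counts) == 1:
        return results[0]  # unanimous: commit the state transition
    # Disagreement: identify the workers that deviate from the majority.
    majority, _ = counts.most_common(1)[0]
    suspects = [i for i, r in enumerate(results) if r != majority]
    raise ValueError(f"discrepancy detected; suspect workers: {suspects}")

print(validate([b"state_v2", b"state_v2", b"state_v2"]))  # all agree

try:
    validate([b"state_v2", b"state_v2", b"bogus"])
except ValueError as err:
    print(err)  # worker 2 disagrees with the majority
```

Because the compared blobs are ciphertexts, the validators never learn the transaction contents; they only learn whether the independent executions agreed.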