So, let's get started then, shall I? Good morning, everyone, and welcome to this inaugural Hyperledger TechFest event. I'm Julian Gordon, VP for Asia Pacific at Hyperledger, speaking from sunny Hong Kong, and I'm delighted to be invited to open this event. Firstly, I'd like to thank the Hyperledger India chapter, the African chapter, and the APAC meetups for organizing the series. This is a great show of collaboration, I think one of the first we've done across multiple chapters.

So what is the Hyperledger TechFest? As I said in one of the LinkedIn messages promoting this event, a question I always get, and many people in the community get asked, is: how does one get to hear from and meet the contributors and maintainers of Hyperledger projects? Well, this is one of those events. Today you'll hear from four contributors or maintainers of Hyperledger projects. These are the people who write the code, and Hyperledger is all about writing code; without them, Hyperledger would not exist. This is the first of three events, and you'll hear from many other contributors over the coming weeks. There is so much knowledge and experience to be shared by these contributors and maintainers, and I'm looking forward to listening and interacting myself today. And please do interact: we have a Q&A channel and a chat channel, so please ask questions. We want to make this interactive.

So why the name TechFest? Arun, who you'll be hearing from soon and who created this event, was keen to hold it during the festival season here in India, so he came up with the name TechFest. Please do enjoy and celebrate this festival season at the Hyperledger TechFest.
By the way, if you don't know Arun: he is the co-lead of the India chapter, and this month he was also elected to the global Hyperledger Technical Steering Committee, so congratulations, Arun. The Technical Steering Committee, or TSC as it's commonly known, is called the heart and soul of the Hyperledger project. It is made up of contributors, like some of those here today, who are elected once a year by the contributors to the project. That's what we call a do-ocracy: those who get involved are those who run the projects. You can listen in on the Technical Steering Committee, and in fact on any of the projects' meetings; it's a great way to get a feel for the pulse of Hyperledger, so I really recommend listening to the TSC calls.

The second question I often get asked is: how does one contribute to Hyperledger? I would encourage everyone listening on this webinar today to get involved in the Hyperledger community. Contribution can come in many different forms. You can contribute to projects, and you're going to hear about a number of projects today that are always eager for more contributors. You can help organize events like this one. You can help run a chapter; I talked to Arun before this, and he's looking for more people to help, specifically in the India chapter, and the other chapters are looking for people too. We also need help with translation. For Fabric, I think we now have documentation in nine different languages, some of it translated here in India: we have Malayalam, Chinese, Spanish, and many more, and we're always trying to make what we do at Hyperledger more accessible. So translation is something we're very keen on, as is writing documentation.
Documentation is very much a requirement: for people who want to use the technology, the better the documentation, the more accessible it is. So basically, any activity that helps build and support our shared Hyperledger ecosystem counts; please do contribute. A great way to start, and I discussed this with Arun just before, is to join the Hyperledger India chapter calls. They happen every Thursday afternoon; you can find the times and dial-in details on the Hyperledger wiki at wiki.hyperledger.org, under Groups, then Hyperledger India, and maybe we can put that link in the chat today. And not just via the chat: reach out to any of us, any of these contributors, anyone involved in Hyperledger. Everyone is very open to helping people come and contribute to this ecosystem. So those were just a few quick words. Thank you all for attending, stay safe, and enjoy the event. Okay, over to Arun.

Thank you, Julian, that was a great introduction. And yes, we welcome all sorts of contributions from India. Whether you're a developer, a program manager, or a project maintainer, please feel free to join the weekly calls. I've just shared the link in the chat for your reference, and you will also find this information in Hyperledger's public calendar of events. To start today's event, we warmly welcome Mr. Gopinath, a senior scientist at NIC, Government of India. He will be speaking about how to write a transaction processor, which is how you write a smart contract in Hyperledger Sawtooth. Over to you, Mr. Gopinath.

Thank you, Arun. Let me start my presentation. As Arun said, I work as a senior scientist at the National Informatics Centre, Government of India, which is the technology arm of the Government of India.
I work in the network division, where we run the country-wide National Knowledge Network, something like Internet2 in America; in fact, we peer with Internet2. In my group we work across software, networking, and people, and we share knowledge across all of it. I have also done system administration for quite some time, so I know the importance of the various verticals in ICT, especially in a government scenario.

I am also pursuing my higher education, a master's degree, and the final semester is all about a project. You can see the logo in my presentation is not NIC's; it is the logo of my institute, BSA Crescent Institute of Science and Technology. For the project, I have always been interested in logs, because logs are very important to me. Whenever a router exchanges route information, it has to be logged. Whenever somebody logs into my email system, there is a log. When a mail goes from X to Z, there must be a log, and law enforcement agencies may ask for it. Logs are critical for many reasons. So how do I make them immutable? How do I secure the logs so that no one can tamper with them? I am basically a Unix guy, and I know that you can log in as root, go to where the logs are, open any log file, and delete entries; it is very easy to tamper with them. So why not use a blockchain for this? Investigating this question led me to a project in this area. My project title is "Trustable Logs Using Blockchain". I am not the first person to have this idea, but it came naturally to me because I work in this field. The problem statement is as follows.
Events are recorded as logs, and logs are used for various purposes; regulatory authorities require them, yet most legacy systems are tamperable. I could use a WORM (write once, read many) device if I wanted something reasonably trustable, but that is very costly and I can't do it in each and every case. Is there a software solution, like blockchain? Yes, because the other name for blockchain is essentially immutability. The project had two phases: phase one was a study, and phase two was the implementation.

Since this is an academic project, there is always the intention to learn something new. I was told Rust is a beautiful language, so I wanted to learn Rust. With that in mind, I searched for blockchain platforms offering a Rust SDK, and Sawtooth naturally came up. I never considered how easy or hard Sawtooth would be; I just selected it, assuming that at some point I would succeed and that people would help me. So I became a member of the Hyperledger community, exchanged messages, and read the documentation. Then I learned about the weekly meetings, got in contact with Arun and others, and things grew from there.

My implementation is a very simple one: a four-node blockchain cluster on AWS, which I can access from anywhere. As for the literature I referred to, there is already a lot on building trustable logs with blockchain. For example, one paper implemented this using Exonum, which is again a PBFT system, and other groups have done similar work. One beautiful paper is Vitalik Buterin's Ethereum white paper, from which I really learned something about blockchain.
Another beautiful piece is the introduction to Sawtooth PBFT on the Hyperledger blog: wonderful, plain, simple English, beautifully explained. These are the references I used during my project.

So what is the architecture? I have four blockchain nodes, and the logs are generated by various systems: an email system, a router, a banking application, any system for that matter. These are the log sources. I then built what I call a log network: I send the logs to all the blockchain nodes in real time. The moment a log entry is generated, it is sent to all the blockchain nodes, so every node knows this event really happened, as it happened. Separately, I have a PBFT network connecting all four nodes. Logically it is a separate network; it can reside in the same IP subnet, but for understanding purposes I keep it separate.

So the logs accumulate on the nodes. Then the owner of the log source generates a transaction. For instance, suppose I am recording the mail log, with the file name /var/log/maillog. Whenever an event happens, the log entries are emitted and transported in real time to the blockchain nodes. Then, after some time, say at the end of the day, the person in charge of that resource, the mail log, generates a transaction and fires it at one of the nodes. The PBFT network takes care of everything from there.
Because all the nodes received the logs in real time, the transaction processor can access them and compare: is the transaction I am submitting now correct or not? If it checks out, the transaction is pushed onto the blockchain, and the chain grows like that. That is the whole setup.

Later, at some point, an auditor comes along and wants to verify the mail log. He asks the owner for the log file, takes it, builds the same kind of structure, and queries a blockchain node: for this log, from start byte 0 to end byte 1024, what is the hash? If that hash matches the hash on the blockchain, that portion is said to be untampered. Then he queries bytes 1025 to 2047, and so on. Range by range, he can verify each block of the log, and if everything is satisfied, the log is trusted.

This also gives me something I call partial trust establishment. Suppose the mail log, or a proxy log, runs to tens of megabytes, and tampering has happened at only one point. I don't need to throw away the entire log; since this is a byte-boundary system, the ranges that are intact can still be accepted, and only the tampered range is discarded. And once the hashes are recorded, the blockchain nodes need not store the real logs; they can purge the original log data and keep only the hashes, so there is no space constraint either. The system has three modules: a client module, a transaction processor module, and a verification module. The client acts on behalf of the log source as the log stream comes in.
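The auditor's byte-range check described above can be sketched in a few lines. This is an illustrative stand-in, not the project's actual code: it assumes the log is available as bytes, that ranges use inclusive byte boundaries, and that the digest recorded on chain is a SHA-256 hex string.

```python
import hashlib


def hash_range(log_bytes: bytes, start: int, end: int) -> str:
    """SHA-256 hex digest of log_bytes[start..end] (inclusive boundaries)."""
    return hashlib.sha256(log_bytes[start:end + 1]).hexdigest()


def verify_range(log_bytes: bytes, start: int, end: int, recorded_hash: str) -> bool:
    """Recompute the hash for a byte range and compare it with the digest
    recorded on the blockchain; a mismatch flags that range as tampered."""
    return hash_range(log_bytes, start, end) == recorded_hash
```

This is also where partial trust establishment falls out naturally: a mismatch in one range discards only that range, while every other range still verifies.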
The client builds transaction batches and sends them to the validator. The transaction processor, the business logic, runs on every node and validates the transaction semantics. (One moment; I think I have chosen the wrong PowerPoint.)

So what exactly am I storing about the log? This is my payload. I store an ID; for example, if it is the mail log, I give it some ID, basically a 32-bit integer. Then I store the start byte and end byte; I did two implementations, and in the second one these are 64-bit start and end bytes. And then the data hash, a SHA-256 value. This payload is serialized using CBOR.

Now, what is the programming model of Sawtooth? There is something called the validator, which is the blockchain node, and within the validator system runs the transaction processor. The transaction processor is application-specific, which means every application developer has to implement one. It validates transaction semantics, and all the business logic is applied there. The validator also contains what is called the global state. Clients, the people or systems who submit transactions, contact the validator through a REST API, both to submit transactions and to retrieve state. So this is the 40,000-foot view of what Sawtooth is about, at a very high level. Inside the validator, as this diagram shows, there are the transaction processors, which we have to write and deploy ourselves.
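As a rough sketch of the payload just described (32-bit log ID, 64-bit start and end byte offsets, SHA-256 data hash): the talk serializes with CBOR, which is not in the Python standard library, so this stand-in packs the same fields with a fixed `struct` layout instead. The field order and big-endian layout are assumptions for illustration.

```python
import hashlib
import struct

# 32-bit unsigned ID, two 64-bit unsigned offsets, 32-byte SHA-256 digest.
PAYLOAD_FMT = ">IQQ32s"


def make_payload(log_id: int, start: int, end: int, chunk: bytes) -> bytes:
    """Build one log-range payload over the given chunk of the log stream."""
    digest = hashlib.sha256(chunk).digest()
    return struct.pack(PAYLOAD_FMT, log_id, start, end, digest)


def parse_payload(payload: bytes):
    """Inverse of make_payload: recover (id, start, end, digest)."""
    return struct.unpack(PAYLOAD_FMT, payload)
```

In the real project a CBOR library would encode the same four fields as a map; the fixed struct simply keeps the example self-contained.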
This diagram shows the core of the validator. It contains the global state, the blocks, and a consensus engine proxy, meaning the consensus engine is pluggable; in my case it is PBFT. There is also the TP handler, the transaction processor handler, and the networking part, which makes full-mesh connectivity with the other nodes.

So what do I do? I read the file in blocks. For me, a log is simply a stream of bytes. I don't require that it be line-oriented text with CR/LF; it could even be a binary log. The events are recorded as a stream of bytes, starting from byte zero. I read the stream and generate the payload: I assign the ID for that stream, take the start byte and end byte, and calculate the hash; that is my payload. After a certain count of payloads, I generate a set of transactions, push the transactions into a batch, and finally call the REST API to submit the batch to one of the nodes.

For the payload I considered various serialization options: it can be represented in CBOR, Protobuf, CSV, or whatever format. Then comes transaction generation, for which I need a signing key: I create the transaction, create the batch header, push the transactions into batches, and make the submission. This figure was meant to show the entire chain end to end, but it is not really visible; I had hoped to make a better presentation, but for the last three or four days I could not work on it at all.
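The client-side flow just described, reading the log as a byte stream, hashing fixed ranges, and grouping payloads into batches, might look roughly like this. The chunk and batch sizes are illustrative assumptions, and a real Sawtooth client would wrap each batch in signed transaction and batch protobufs before POSTing it to the validator's REST API; that step is only noted in comments here.

```python
import hashlib

CHUNK = 1024   # bytes per payload; an assumed range size
BATCH = 10     # payloads per batch; an assumed batch size


def chunk_log(log_bytes: bytes, log_id: int):
    """Walk the log as a plain byte stream (text or binary) and emit one
    payload per CHUNK-sized range, as in the talk's client module."""
    payloads = []
    for start in range(0, len(log_bytes), CHUNK):
        chunk = log_bytes[start:start + CHUNK]
        payloads.append({
            "id": log_id,
            "start": start,
            "end": start + len(chunk) - 1,   # inclusive end byte
            "hash": hashlib.sha256(chunk).hexdigest(),
        })
    return payloads


def into_batches(payloads):
    """Group payloads; a real client would sign each group into Sawtooth
    transaction/batch protobufs and POST them to the /batches endpoint."""
    return [payloads[i:i + BATCH] for i in range(0, len(payloads), BATCH)]
```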
So I could not enlarge the figure or touch it up. Now, what is the transaction processor module? It runs on every Sawtooth node. If I write a Sawtooth application, all the business logic goes into the transaction processor, and that processor must run on all the blockchain nodes; with my four nodes, all four run the same transaction processor module. Multiple transaction processors can run at the same time, so a blockchain node can support several applications simultaneously. The processor uses get and set methods: set to write state derived from the payload, and get to read state back.

A transaction processor module has to implement an apply method, which the validator calls. Whenever a validator node receives a transaction from a client, it selects, based on the transaction's family name and family version, which transaction processor it belongs to, and calls that processor's apply method; the business logic runs there. Depending on the business outcome, whatever we wrote in the transaction processor sets or gets the state. That is the basic idea of the transaction processor model.

The same thing is depicted here. When a transaction processor starts up, the first thing it does is say hello to the validator: it registers, declaring "I am a transaction processor listening for this family name and family version." It is something like opening a TCP/IP socket and listening on a port. The validator node then knows this, so whenever it receives a transaction of that family, it knows where to route it.
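The registration-and-apply pattern just described can be sketched as a self-contained stand-in. A real Sawtooth processor subclasses `TransactionHandler` from the Sawtooth SDK, registers its family name and versions with the validator, and reads and writes state through a context object; here a plain dict stands in for the state context, and the family name is invented for illustration.

```python
import hashlib


class LogTransactionHandler:
    """Toy stand-in for the project's Sawtooth transaction processor."""

    family_name = "trustable-logs"   # assumed family name, for illustration
    family_versions = ["1.0"]

    def apply(self, payload: dict, state: dict):
        """Business logic: accept the claimed hash only if it matches the
        hash of the log bytes this node received in real time."""
        key = (payload["id"], payload["start"], payload["end"])
        observed = state["raw_logs"][payload["id"]]          # logs streamed in
        chunk = observed[payload["start"]:payload["end"] + 1]
        if hashlib.sha256(chunk).hexdigest() != payload["hash"]:
            raise ValueError("InvalidTransaction: hash mismatch")
        # "Set state": record only the hash; raw logs can be purged later.
        state.setdefault("recorded", {})[key] = payload["hash"]
```

In the real SDK the mismatch would raise the SDK's invalid-transaction exception rather than `ValueError`, and the recorded hash would be written to an address in the global state rather than a dict key.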
After the syntax validation succeeds, the validator calls that particular apply method. The apply method receives the TpProcessRequest and the context object as parameters; the transaction contents are passed in there, so the processor can do its work and then get or set state accordingly. In the handler, you have to set the family name, family version, namespace, and so on. So this is basically what I built, along with various verification routines and plenty of print statements. Given my circumstances I will stop here; kindly excuse my poor presentation today, and next time I hope to do better. Thank you.

We have a couple of questions, if you don't mind answering them. The first question is: where are the logs actually stored?

Initially, the logs are also stored on the blockchain nodes. Once the transaction is fired and recorded in a block, the logs no longer need to be stored on the blockchain node. The log owner, however, needs to keep the logs in safe storage to produce them for the auditor later. The transaction processor needs the real logs temporarily, because only by knowing the actual log content can the nodes come to consensus and record the hash. So the nodes store the logs in real time, temporarily, and once the transactions are recorded, those logs can be discarded from the blockchain node. Thank you. And one more question.
How long did it take you to build such a system, including testing?

I started somewhere in the last week of February, when I began interacting with the community. Initially I suffered a lot, in the sense of reading documentation, and Rust itself was a challenge for me, because I really wanted to do this in Rust, and things were not going well. I remember that one night, around the first or second week of March, I wrote to the list saying I wanted to switch to Python, and someone sent me some sample files. With just a few insights from those, I got a clue and started writing. So practically speaking, I really started in the first or second week of March, and I completed the initial version in April itself.

The thing is, I started with a Bitcoin-style design, because anyone who wants to read about blockchain naturally starts with Bitcoin, where everything is in the blocks and there is no emphasis on state, no addressable state. So that is how I started: I stored everything in one particular state, and for verification I dug through the blocks. Then in one of Dan Middleton's videos he specifically said: no, this is not a blockchain design, you have to use addressable state. My God, I had to rewrite my program. So I moved to the second style of programming; that did not take too long, since the main change was manipulating the addresses, though some reading material was required. That took about 15 days or so. In total, I believe it took a maximum of three months.
The polishing and fine-tuning takes a lot of time, which I am not counting, but the core work took around two to three months, I think. And thanks to Arun, who answered a lot of questions over email. Thank you.

Thank you, sir, thank you again. We'll move on to our next presentation. The next topic for today is how we scale up a Fabric network: how do we make it ten, a hundred, a thousand times faster? To answer these kinds of questions, we have a team joining us from the United States, and I would like to welcome Michael and his colleagues. Michael is the CEO of PraSaga, and I'll hand over to him for the session.

Good morning, everybody, and thank you very much for inviting us and allowing us to participate. Is my screen being shared? That's always the first question. Yes? Okay, we'll just go to the slide show. Again, thank you, and yes, we do have the team on, so let me make a quick introduction. My name is Michael Holdmann; I'm founder and CEO of PraSaga. PraSaga is a blockchain technology company: we have designed and are building an open, permissionless, trustless chain with a token-based economy. Through the process of doing that, we ran into some understanding, some enlightenment, about blockchain scaling issues, and that's what we're going to talk about here: specifically, one of the technologies that came out of our research and design phases.

In the process of looking at how to scale a blockchain, one of the things we realized was that we couldn't see a way to truly scale, because of the inability to actually move state between parallel chains, from shard to shard, once sharding technology came up. And the challenges we found apply to chaincode and smart contracts alike.
Smart contracts are static, not dynamic; versioning or updating chaincode or smart contracts is difficult. We compare it to having to rewrite the Word application every time you want to write a Word document: you basically copy a smart contract, paste it into an editor, edit it, and upload it again, re-uploading the program every single time. They're not easily created or updated. Yes, you can write a simple smart contract to transfer a token from one wallet to another, but when you get into complex supply chains and the like, it is not so easily done; it can take multiple FTE-years to create and deploy. The code requires reloading for each new contract, and there is no global repository.

So what we have is the eXtensible Blockchain Object Model, XBOM. A little history: we call it a decentralized global operating system, and this is the first time we've integrated it with a blockchain, on Hyperledger. It goes back to the message-passing architecture developed at Xerox PARC, the Palo Alto Research Center. That message-passing architecture, Smalltalk, was introduced in the early 70s, and it has been used as an underlying architecture for the leading global operating systems, as you might remember. We compare smart contracts to MS-DOS, where you run one application at a time, versus the arrival of Windows, which has a common object model, a class tree; that is a modified version of the first-class object model. The first-class object model was also used by IBM in its System Object Model, and by GO Corporation in its PenPoint OS, a tablet-based operating system.
Another system that used a modified first-class object model was Mac OS X; it came by way of Steve Jobs' NeXT and its operating system. So there is real history behind the first-class object model. Dave Beberman will go further into this; I'm covering the highlights, and then Dave, our co-founder and CTO, will go deeper into the actual technology, show how classes and objects work within Hyperledger, and do a brief code review showing how it loads into Hyperledger. What it is, basically, is an object-oriented blockchain.

We did have one individual, Casey Tam, a Hyperledger instructor, who was gracious enough to take the time to review the development environment and testnet we have set up and available now. It is an MVP; this is not production-ready yet, but it does allow you to go in, start using the object model, XBOM, and test it out on Hyperledger.

The interesting thing this does, and I've highlighted it, is that theoretically the developer does not care about the blockchain, as it is entirely handled by the infrastructure. So, the advantages of having an eXtensible Blockchain Object Model: what makes go-to-market 10x or 100x faster? The reason is that everybody references the same exact class, the same copy of code; you're not rewriting code each time and uploading it into contracts. It's fully dynamic: it instantiates code by reference from a class tree into your account. And it supports real-time updates with no service interruptions.
If somebody sends you a transaction and you do not have the proper code to execute it, or whichever part of it you're supposed to execute, then your account will look up the class tree for the necessary code, and if you have all the proper credentials, it will be instantiated into your account on Hyperledger. It creates a repository of reusable code across all nodes, and it makes many-to-many relationships feasible on a single chain. What we mean by this, in supply chain terms, is that the issue is not when you're General Motors at the top of the chain, but when you're the bolt manufacturer further down the supply chain tree. XBOM doesn't care whether you're on the IBM blockchain platform, the Oracle blockchain platform, SAP, or AWS; it just knows that you're a Hyperledger Fabric blockchain, and it will integrate with that.

What we have developed is a blockchain application framework. On the business side, it covers layers three, four, five, and six; layer five in this instance is Hyperledger, which is unknown to, and does not have to be managed at all by, the application developer. The application developer always works up in layer two, and the applications are where the user interfaces with the blockchain; they can cover pharmaceuticals, oil and gas, aviation, the list goes on. In a minute or two I'm going to turn it over to Dave, who will go deeper into how we're actually integrated into the peer nodes.
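The on-demand instantiation just described, an account looking up missing code in the class tree and instantiating it if credentials allow, can be illustrated with a toy registry. Everything here (class names, the credential check, the in-memory dicts) is invented for illustration and is not PraSaga's actual API.

```python
class ClassTree:
    """Toy stand-in for XBOM's shared class tree: a registry of named,
    pre-loaded classes that every account references instead of uploading
    its own copy of the code."""

    def __init__(self):
        self._classes = {}

    def publish(self, name, cls, required_credential):
        self._classes[name] = (cls, required_credential)

    def lookup(self, name, credentials):
        cls, needed = self._classes[name]
        if needed not in credentials:
            raise PermissionError(f"missing credential for {name}")
        return cls


class Account:
    """On receiving a transaction it cannot yet execute, the account looks
    up the class in the tree and instantiates it locally on demand."""

    def __init__(self, tree, credentials):
        self.tree, self.credentials, self.instances = tree, credentials, {}

    def handle(self, class_name, method, *args):
        if class_name not in self.instances:   # instantiate on demand
            cls = self.tree.lookup(class_name, self.credentials)
            self.instances[class_name] = cls()
        return getattr(self.instances[class_name], method)(*args)
```

The point of the sketch is the lookup path: code lives once in the tree, and accounts gain it by reference, which is the claimed source of the faster go-to-market.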
One of the things we had to ensure when we developed this is that we make no changes to the Hyperledger source, or any of the other sources used to stand up a working Hyperledger Fabric network. The other thing Dave will do is show you exactly how code loads dynamically: you don't have to add more chaincode, and you never have to add new channels. You can dynamically add new classes to the class tree, and whenever those classes, or the code within those class objects, are needed, your account automatically instantiates a class object. So I will stop sharing and let Dave carry on. If anybody has questions on this part, please do ask; I'll be back to finish up the last part after Dave is done.

Good morning. Let me share my screen. Okay, can everybody see my screen? I assume yes. So, a couple of quick things about the eXtensible Blockchain Object Model. Another way to think about it: instead of object-oriented programming, it is an object-oriented blockchain. I'll go into why and how we do that, but the core piece to recognize is that any object, in any object-oriented environment, needs a reference to it. In a programming environment you essentially have a pointer managed by the language runtime, whether that's C++, Java, C#, Python, you name the language; even languages without inheritance do the same thing. What we recognized is an interesting property of a blockchain: because it's immutable, you can have completely unique object references stored on the blockchain.
You'll see the advantage of that; it's called a first-class object model. First-class object models go all the way back to Smalltalk and before, Smalltalk being the language whose designer invented the term object-oriented programming. We're basing this all the way back on that technology, which is all message passing with what's called late dynamic binding, or very late dynamic binding. So that's how we put it onto Hyperledger Fabric. Technically we could probably put this on Hyperledger Sawtooth as well, although we started on the path of Fabric first, and we're focused on getting a proof of concept out. On the left you'll see a traditional setup. I use the word traditional because that's how Hyperledger Fabric works; I don't mean it as a negative, this is how it actually works. We actually use the test network from fabric-samples, and you'll see that later. Then you load chaincode, and transactions are sent to the chaincode; the chaincode interacts with the database in the state database layer, and the blockchain consensus layer manages that. We don't change any of the Hyperledger Fabric code to make this work. The difference is that, for us, the chaincode layer loads the class manager infrastructure, which enables the class manager environment for classes and objects. Above that, you dynamically load class code into your runtime environment. We actually use what's called the external chaincode model for Hyperledger Fabric, as opposed to having Hyperledger Fabric build the source and a Docker image for you; it's part of version 2.0 and beyond. On top of that we have the class application framework, ordinary objects, and accounts, and I'll walk through them very quickly. So in the standard environment, the first part, your Hyperledger Fabric, is managed by peer nodes and orderers.
For the test network we actually just use a simple configuration: it runs two peer nodes and one orderer. I believe the orderer is currently doing Raft; I don't remember exactly how it's configured, but it's transparent to us. The chaincode layer is loaded by the peer; we're using the command line from the peer node command at the moment, but obviously you can send the same strings from an API. For an external chaincode, you actually send the peer node a tar.gz file that's configured appropriately; you can read about that in the online tutorial for Hyperledger Fabric. A very important point I think we skipped so far, and this just reminded me: in our current implementation, all of the classes are written in the Go programming language. They're loaded as shared objects, which are known as Go (Golang) plugins, and there's a specific build command to enable doing that with Go. What's very important here is that at runtime we're not actually compiling the code. You're literally loading pre-compiled code, so it only needs to be loaded once per node. Once a class is loaded when it runs a transaction, it can stay in the environment, so you don't really have an overhead for loading it. In the application framework, I will show you the foundation classes, we have a diagram of them, and we'll do a very quick command-line load. We actually call them the bootstrap classes. You have the ability to create accounts; accounts can contain ordinary objects, and an account is itself an object instance. Every account has one root object called the account object, and everything else is considered containment underneath it, so you have a containment tree for each account. We use that model superimposed on top of this environment; it's part of the bootstrap classes. It's actually possible to think of other models, but in the blockchain world, accounts are the common construct for everybody.
A little bit about what a first-class object model means; it's very, very relevant to how we made this work. In a first-class object model, everything is an actual object that exists at runtime. This means the classes that create objects are objects too: class Object creates ordinary objects, and a metaclass object makes classes, with a reference to each one of these. Because the blockchain is completely immutable, it's now rather trivial to make global references to every single object, whether it's a metaclass, a class, or an ordinary object. And everything is inheritable, so you can inherit from a metaclass or a class, and you can inherit from an ordinary object. This is key, because it means that once you've established any object, class, or metaclass, that reference is usable anywhere at any time, subject to any scoping rules; if you're not allowed to send a message into somebody else's account, that's a thing, but as far as the reference itself is concerned, it's just another object reference. The basic structure: once you've loaded the class manager infrastructure, it loads the root class, class Object. This is single inheritance, not multiple inheritance, although it could be done with multiple inheritance; so single inheritance with the ability to create interfaces. The root that everything inherits from is called class Object. Class Class is the root metaclass, and class Dictionary maintains a string mapping between names and what we call ledger object IDs. What I mean by that is we create independent object IDs, which we actually compute with a chained hash using the standard SHA-256 algorithm, and to map back and forth we call those ledger object IDs, so you can work with them from strings. As you'll see farther on, we return the SHA-256 hash as a string of integer numbers.
Here's what the bootstrap classes look like. To enable dynamic loading of new plugins on any node, which you need because all class implementations are plugins: if you send a transaction to a node that hasn't seen that object or that class tree yet, the node needs to be able to load it. Class ClassFileLoader is what's used to load a new plugin in and establish that the class is there. Technically, the class instance data is on the blockchain already; this is to load the code that runs the class. Class XBOM is the root of all future classes; just as class Object is for all classes, class XBOM is for everything related to the XBOM environment. Going down the tree a little more, class XBOMContainer is what implements the concept of containment, the parent-child relationship. It's literally a list of children, where any child can be another XBOMContainer, or a subclass, and have more children of its own. This is identical in concept to the window tree that you use every day, and that we're using right now; the only difference, of course, is that this is not a user interface, just a straight object-oriented containment environment. Class Account, which is the root for accounts, is of type class XBOMContainer, which means it can contain objects; I have a picture of that further down. On the left side is the part that uses a standard design pattern called the factory pattern, and of course we named it class Factory, not very original. Class AccountManager inherits from class Factory, which means an instance of class AccountManager has the ability to create new objects, and the objects it creates are accounts. So you would expect the class AccountManager to create instances of class Account or a subclass, and that's exactly what happens. A little bit more on the bootstrap cycle: we created all those classes.
Of course there's a little bit of bootstrapping: class ClassFileLoader loads classes, but you have to load class ClassFileLoader itself first to be able to load classes, so that piece is done manually, and then you can use it to load all the other classes, which we do. Then we instantiate two initial objects: the root dictionary, which is an instance of class Dictionary, to give you a bootstrap way to find initial ledger object IDs, and the root account, which is what you need to be able to send anything. All messages to all objects work the same way: you send the message to an account, to an object in that account, with the method name you're calling (your message name) and the input arguments for it. So you need those two things to get running, and you'll see that shortly. This is what the whole thing looks like. At this point you're ready to actually create an account, which I'm going to do in a minute, and this is what an account would look like. So here's a root account object; I'm sorry, this is the root object of an account, I'm using the wrong word. The account object could be an instance of class Account, as shown, or an instance of a subclass of class Account. Because it inherits from XBOMContainer, it's perfectly possible to have any number of objects within that account, and any object that also inherits containment can have another child list of its own. It's important to note, though it was hard to diagram correctly, that because the ledger object IDs are not typed, you can send a message to any object, whether or not the object is of the right type. If you send a message an object doesn't recognize, it will simply return "not implemented". That means all lists are heterogeneous lists of objects.
This is different from most object-oriented environments, which try to do that type checking at compile time. We don't do that; there is some type checking we do to make the message passing easier, but at runtime it's actually possible to send any message to any object. Dynamic binding has some interesting capabilities that enable a different style of object programming than you might be used to if, for example, you write Java code. Here's what a transaction looks like. Whether it's CouchDB or LevelDB, the Hyperledger Fabric state database (unfortunately I'm not familiar with the Hyperledger Sawtooth database, but I assume it's similar) stores key-value pairs. For us, the key is always a ledger object ID, which is the SHA-256 value, and the value is the instance data of the object instance. Now remember that classes and metaclasses are objects too, so the instance data for the class and the instance data for the metaclass are stored in there as well, and we take advantage of that. We manage all of the inheritance, pulling the instance data off of the blockchain at runtime, and during transaction processing anything that's changed is automatically put back on the blockchain. If you don't make any change to instance data, then no update is made. And because of the automatic replication of the database, you don't need to do anything else at that point. So, going down the very straight flow: at transaction start, you're sending, from a client, the account object and the target object. Again, you retrieve the ledger object IDs; I'll show you how that's done.
You also send the method name of the method you're calling on the object, and the input arguments. The class manager infrastructure first uses the object's ledger object ID, the LOID key, to look up the object; it walks up the class tree to make sure all the classes are there, and pulls all of the instance data out, so you now have a running object. All of this happens transparently. The method then gets dispatched to it. Dispatch honors inheritance, so if the method is inherited from one of the ancestor classes, it will call the appropriate one; and you can actually make ancestor calls, walking right up the tree. It executes the code, which can include, by the way, calling any other object in your environment that you're legally allowed to, so a transaction can consist of changes to multiple objects' instance data. When the transaction completes successfully, the data is written back to the database automatically. As long as no error is returned, Hyperledger Fabric will distribute it to the other nodes automatically in the standard way. So all of that happens automatically for you. Just a quick piece on how, given all of that sounds great, we actually do this: you write a class in Go. I should note that theoretically you could write in multiple languages, and even within one inheritance tree you could have a class implemented in one language and an ancestor class implemented in another; we will actually build something like that at some point, we've done it in the past. Go first was to get a proof of concept out about as fast as possible. You write your implementation piece in just ClassName.go; these are of course conventional names, you don't have to use exactly that, it's just easier when you're looking at your code. Your class implementation file maintains the data type definitions.
It also has a method table, which is used for message-name-to-method resolution; the implementations, which are the actual code that gets executed; and some boilerplate that's required to load and run the class. The stub file is what's created; you create these things by hand at the moment, though in the future we would expect to generate some of this with automatic code generation. The stub file is what other classes compile against to send messages to instances of your class, and the skeleton file is what receives them. The class manager infrastructure expects a stub file and a skeleton file; it will call into the skeleton file with the method to call, and the skeleton file will then call the correct method. So I'm writing this code by hand; at some point, like I said, we will automate it. You can see an instance of this in our open source. In case I forget to say it later: xbom.io is our website, where you can register and then pull down the open source for this. What I'm going to do here, instead of going through this piece (again, we're trying to keep this short), is bring up and do a quick example of this loading. Give me one second; I have to start up my environment, we run on a remote desktop on AWS. Here's the share for this; there we go. Okay, you should be able to see my screen. Up to this point I'm using the straight, well, actually a slightly modified, test network script; I'm in the test network of fabric-samples, which you get when you install Hyperledger Fabric. The original shell script, which was network.sh, we've modified slightly into classmanager.sh. What we've done so far is just load up the blockchain itself, as you can see here, so I have two peer nodes and an orderer running.
You can see these as Docker containers running. I'll come back to that in a second. We did not change any of the source code for Hyperledger Fabric whatsoever. What we did need to do is build our own Docker images, because the default peer nodes for Hyperledger Fabric are built with Linux images that do not enable dynamic loading of shared objects; technically, in Linux, that means you need what's called the ld-linux .so library. So what we did is simply change the base Linux image that's built into the Docker images. Unfortunately, that makes the image larger than the very small ones used for Hyperledger right now, but it doesn't change any of the source code or functionality of Hyperledger. So now we're going to run our bootstrap. Because we want to make the ledger object IDs easy to see in a minute, deterministic rather than random, we have a flag, which I've highlighted, called LOID; let me paste it over here. Normally you can put a random value in there every time you restart the Hyperledger blockchain, so you always have completely unique numbers within any given chain; for test purposes we wanted it deterministic so I can run scripts, so this forces a default value, that's all it's doing. Here's our bootstrap, so this is now creating all the bootstrap classes. I get a nice, very simple return message, "all objects saved"; that means all of the classes, which are objects, for the bootstrap environment have now been stored into the blockchain so that all the nodes can use them. It does the standard thing: it sent out all the transactions to the nodes, had them signed, and then sent them back out. This does a standard invoke; as you can see, we're using the chaincode invoke command from the command line for this. Everything is standard here.
Next, I'm going to get the root dictionary. This is a built-in function, because you need to bootstrap from somewhere, and that's what this does. It returns the ledger object ID for my root dictionary. We return ledger object IDs as printable numbers, but this is of course a SHA-256 value; I could print it out in hex as well, it's just printed and expected as integers here, but you could easily change that. Then I'm going to get the system account object, which is the object we were talking about a minute ago in the trees I was showing earlier, and the ledger object ID for that. You need to collect those two starting-point ledger object IDs to do anything; now I have something I can send a message to and ask things of. So now I'm going to send the first message. Messages are "object send"; there's also something called "object call", which is internal to the runtime, but object send is you saying: send this message. Actually, it's easier to see in the terminal window, I think, so let me bring that over. This is, as I said, the style of making a call; there are a couple more steps to it. Sorry if you couldn't see the other window. So in this environment we have the class manager invoke, and the arguments are: here's object send; here's the ledger object ID for the account, right here, all the way across; here's the ledger object ID of the object we're sending the message to; here's the method string, the name of the method we're calling, which is class Dictionary's "get object by name"; and here's the parameter, the input arguments. We're looking for the root account manager, which is an object instance that's been created so I can ask it to make a new account for me. I'm going to send that message, and when it's done, right here it has returned to me the object handle of the object I want, which is the root account manager.
So I now have an object that I can send a message to, and the message I'm going to send in one second is to create a new account. Here it comes. So here's another message: here's the account object, here's the target object right here, and you'll notice it's the same as what was returned right here. So I literally got the object and I'm sending a message to it. Because I know it inherits from class Factory, I'm asking it to make a new object; that's a message implemented in class Factory and inherited here. The argument it takes is how many to make, and of course I only want one at the moment. And it has now made an object; this is the handle returned, again a ledger object ID. If you notice the little unique piece right here, the double bracket: that's because this would actually return an array of objects if I asked it to make multiple. It is a class factory, and I'm having it make accounts here, so I could say to the account manager, make me a whole bunch of accounts of this account type. One more piece. So I now have an object, and the object is an account. It doesn't actually contain anything, it's a very short demo, but obviously I could add new objects to it. The interesting thing is: how do I know what classes an object inherits from, and what the methods are for those classes? What we added is the ability to do inspection and return JSON schema, so you know how to send messages to it. The message I want to demo for that is implemented in the root class Object, so all classes have it, or rather all objects have it: "class object inspect", which is going to return what looks like a very ugly piece of text.
But in fact, if you pretty-print this, you will see it's standard JSON schema. It walks up the entire class tree and gives you all of the methods and all of the arguments, in JSON syntax, that you can use to send to every single class the object inherits from. So I can interrogate any object, whether it's an ordinary object, a class, or a metaclass, and retrieve this information. That gives you a very dynamic way to manage things; you could build dynamic user interfaces from it, for example. That's as far as I want to go. Go to xbom.io; when you register there, it will give you the link to what we call the hlf-xbom-public (Hyperledger Fabric XBOM public) GitLab, and you can pull that down. There's a script to set up your environment, a Linux environment (we don't do Windows), and it will come up and run for you; it's the same environment I use. Thank you very much. I do see there are a lot of questions; I'll try to pull some up and look at them. Let me stop sharing, and Mike, if you want to take over, I'll take a look at the messages. Okay, thank you very much, Dave, for that. I am going to share one last time. Can you see my screen, the slideshow? Can everyone see my screen and hear me? Okay. So, the last thing: there was one question that came in that I saw, asking whether this is open source. There are two parts to this. For the developer community: the class code, what you use to build classes and whatever you offer on top. You can commercialize anything you build there; those classes, the vertical-specific class trees you make, whatever you figure out, those are your applications. Yes, that part is open source, and we don't expect any royalties or revenue from anything you create there. What is licensed is the CMI, the class manager infrastructure, and that will be a SaaS license to the enterprise. This is not something we're looking for the developers to bear any costs on.
We also have a channel program set up, so there are channel opportunities to participate with the organization. Again, the class manager infrastructure is a monthly SaaS license to the enterprise; the class build code is fully open source, with no obligation to contribute back to the project, and whatever you build is not open source unless you want it to be; it is fully commercializable by you. Okay, so next steps. We are looking for proofs of concept now; we are working with a couple of different organizations here in the United States. We're looking for applications and projects, and a team to work with, to document proof-of-concept objectives and then work with us on POC design and testing. We understand that it's a new way to look at the architecture of blockchain, so we're definitely here to assist in any way we can. So, as Dave said, you can go to xbom.io and register for access to our GitLab repository, where you can download the code and play with it. On xbom.io, on the overview page, towards the bottom, there are three documents that go deep into the overview: the functional overview of XBOM on Hyperledger, which goes into the concept of accounts and classes and objects, and finally the JSON environment that Dave built. We also have the Telegram chat, and I believe this presentation will be available to everybody afterwards, so you can use the links there. Our current support for developers is through our Telegram developer channel. Yeah, hi, go ahead. Yeah, Michael, thank you. A very important question regarding XBOM: I think you may also be aware of DAML and the Corda project. Someone is asking how XBOM relates to Digital Asset's approach, because they also put some kind of abstraction at the chaincode level.
I'm sorry, I did not hear that question; can you ask it one more time? Dave, did you hear it? I just did; I guess there are some network issues with the question. Yeah, there is a question on the Q&A portal asking how we can relate XBOM to other solutions like DAML. Dave, I'll let you answer that. So I'm not sure what you're... oh, is it DAML you're talking about? Okay. So the key difference between the XBOM environment and virtually anything else you've seen in any sort of smart contract or chaincode (or any other name for blockchain code) is that the class code, or in this case actually the reference to the class, is stored on the blockchain, so a class exists as an object itself. The instance data of a class includes telling you what messages and methods it will take and how to call the code; in the Go environment it also tells you where to find the actual plugins, but if you were running an interpreted language, it would point to the interpreter code itself. When you instantiate an object, it walks up the class tree, literally looking up on the blockchain, for each class (which is itself an object), the reference to the next class in its inheritance tree. So if you're adding new capability, for example, you're adding just a subclass; you're not recompiling the entire environment into a single image. That's a different concept from DAML. Okay, go ahead. So, thank you Michael and Dave for your detailed technical presentation; it was really nice and informative. Now we have the next presentation, about private data collections. Our speaker is a researcher who has contributed to Hyperledger Fabric, and he also maintains his own blog.
He writes about Hyperledger Fabric and its concepts in detail, and for anyone just starting with Hyperledger Fabric, following him is a must. Senthil, I'll send it over to you. Hey, I don't see Senthil on the panel, I think... I'm joining. Yeah, hi, Kamlesh; I think my Zoom crashed. Okay, I just introduced you; take the presentation forward. Yeah, thanks, Kamlesh, for the introduction. So, this talk is about private data collections in Hyperledger Fabric, which have been available since version 1.2. This work was done in collaboration with other Fabric maintainers, including Manish Sethi and Dave Enyeart. This is the agenda of the talk. First I will present the transaction flow involving just public data. Then I will present the motivation for providing data privacy within a channel. Then I will explain how the private data transaction flow works in Fabric, including private data dissemination and how to define a private data collection for a chaincode. Naturally there will be questions about how channels and private data compare to each other, what the pros and cons are, and when one should use a channel versus a private data collection, so I will talk about that. Then I will dive into some of the implementation details, more like internals, to show how the code executes when there is public data and how it executes when there is private data, especially on the commit path. Then I will talk about some performance numbers with private data collections, some of the optimizations we have been doing, and some we are planning for the future to improve those numbers. And finally, some ongoing work. Okay, let's look at the public data transaction flow. I'm assuming the audience is aware of smart contracts, endorsement, and the state DB, some background on Hyperledger Fabric.
In this setup I have four organizations, A, B, C, and D, each hosting a peer, and there is a state database. Just as an example, there is a smart contract called DS that is hosted on all the organizations, and we have clients and an ordering service. On the right-hand side there is a committing-only peer, which does not run any smart contract. Let's say organization B wants to submit a transaction; basically it is trying to store a value, an account ID and a balance for that account ID. Depending on the endorsement policy for the smart contract, it submits the transaction to the relevant organizations. The transaction is then simulated on those organizations, and the simulation results, along with each organization's digital signature, are sent back to the client. The simulation results include many things, including the transaction message submitted by the client, but for us the key thing is the write set. They also include the read set, but for simplicity I have assumed the smart contract is not doing any reads; as a response to the simulation, it just gives the write set. The write set includes the account ID and the balance, and the digital signature is nothing but the endorsement here. Then the client collects enough endorsements and submits the simulation results and all the endorsement signatures for ordering. The orderer performs consensus to create a block of transactions, and the block is delivered to all members connected to the channel. The block is then opened, and for each transaction, the endorsement policy validator is invoked first to validate against the smart contract's endorsement policy; if you remember, the client collected signatures, and those need to match the endorsement policy defined for the smart contract. If that is valid, the transaction goes into MVCC validation.
MVCC validation basically ensures serializability isolation; in simple terms we can think of it as a double-spend detector, and if it detects something like that, it invalidates the transaction. What it does is this: there is a read set, with a version for each key, and it checks whether any of the reads have been modified. It's simple: if I have read some data and decided to make some modification, but before the commit the read data has been changed, then obviously I cannot commit, because the logic might give a different output if I executed the program again. That's what MVCC validation does. If the transaction is valid, it is committed, and the transaction status is sent back to the client. So this is, basically, how public data works in Hyperledger Fabric. Now, in this setup there is a need for data privacy. Let's take a sample blockchain solution, a supply chain network, specifically a food supply chain network. Assume there is a chocolate manufacturer. If I take the overall supply chain network, there has to be a farmer delivering the raw materials. Then there will be a wholesale supplier, because the manufacturer may not buy directly from the farmers; the wholesaler takes care of dealing with multiple farmers. Then there could be multiple distributors and different retailers. So, for example, let's take a single farmer, supplier, manufacturer, distributor, and retailer, and say they form a consortium and start a blockchain network on Hyperledger Fabric, with each of them running a peer. We need not go into the individual details of which actions are possible; all of these actions are recorded on the blockchain.
And the key thing is, in Hyperledger Fabric, as we saw, when a transaction is added to a block, the block is replicated to all the members, so every member gets to see the data. Especially if you look at the invoice and proof of payment, these are very sensitive data. For example, if the invoice sent to the manufacturer by the supplier is visible to the distributor, then the distributor can calculate the profit, or even the farmer can do the same. So we cannot store that data directly. One simple solution is to store only the hash: just store the hash of the invoice or the hash of the proof of payment. But then we need to manage an external store to keep the actual data, and worry about synchronization between the data present in the blockchain and the data present in the offline store, which adds a lot of additional overhead. And the smart contract cannot directly access data that is present outside the peer. So this naive solution doesn't work. If you think about it at a high level, the kind of solution we need is this: between the farmer and the supplier we need a store holding sensitive information that should be seen only by those two entities. Similarly, for every pair of entities I need some kind of private data store; that would solve the problem, as a high-level idea. Then in the replicated ledger we could store things like the order ID and product ID, and that could be used to refer back to the private data collection: using the order ID, I can probably get the invoice for that order ID from the farmer-supplier private data. This is the high-level concept. So let's look at the exact transaction flow that uses private data. I am taking the same example, but now you see two new boxes. One is private DB AB: here I am assuming there is a private collection between organizations A and B.
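The rejected hash-only approach would look something like this. A Python sketch, purely illustrative: only a SHA-256 digest is replicated to every peer, while the invoice itself lives in an external store that the application must keep in sync on its own.

```python
import hashlib

def invoice_digest(invoice_bytes):
    # Only this digest would be stored on-chain and replicated to every peer.
    return hashlib.sha256(invoice_bytes).hexdigest()

def verify_offchain_copy(invoice_bytes, onchain_digest):
    # Anyone holding the real invoice can prove it matches the ledger entry,
    # but the smart contract itself can never see or act on the plaintext.
    return invoice_digest(invoice_bytes) == onchain_digest
```

This captures both points from the talk: the hash keeps the invoice confidential, but the burden of storing and synchronizing the actual document moves entirely outside the peer.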
Any data that is stored in private DB AB will be visible only to these two organizations; it will not be visible to other organizations. And then there is something called the transient DB, which is more like an internal store. When I explain the transaction flow, it will become very clear why the transient DB is used. So let's take the same example: storing a balance for an account ID, and the simulation of the transaction happens. Here, what exactly happens in the simulation is that the account ID and balance are actually written to the transient DB. Then, when the simulation result is sent back to the client, it includes only the hash of the account ID and the hash of the balance, because this is going to be included in the block, and the block is going to be distributed to all the organizations. If I stored the plain account ID and balance, everybody would get to see the value, right? So, in step two, the actual value is stored in the transient DB, and only the hash of the value is sent to the client, so that it can be sent to the ordering service for ordering. Then the ordering happens, a block is created, and the block is delivered. Here the validation happens as usual. And then there is a new step, which is called fetch private data. When we talk about the implementation, you will find out when exactly this step eight is executed. Basically, each peer checks whether the transaction has a private data hash in its write set. If it has a hash, the peer checks whether it belongs to a collection the peer is a member of. If it does, the peer knows there should be some private data in the transient DB, so it tries to fetch the private data from the transient DB. If you remember, in step two we actually stored the account ID and balance in the transient DB, and in step eight it is fetched.
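Steps two and eight above can be sketched together. This is a hypothetical Python simulation (not the peer's actual code): endorsement keeps the plaintext in a transient store and puts only hashes into the write set, and the commit step later fetches the plaintext back by matching it against those hashes.

```python
import hashlib

def sha(b):
    return hashlib.sha256(b).hexdigest()

def simulate(transient_db, key, value):
    """Step 2: keep the plaintext locally; return only hashes for the block."""
    transient_db[key] = value
    return {"key_hash": sha(key.encode()), "value_hash": sha(value.encode())}

def fetch_private_data(transient_db, write_set_entry):
    """Step 8: find the local plaintext matching the hashed write-set entry."""
    for key, value in transient_db.items():
        if (sha(key.encode()) == write_set_entry["key_hash"]
                and sha(value.encode()) == write_set_entry["value_hash"]):
            return key, value
    return None  # would trigger a pull from another collection member
```

Every peer on the channel sees the hashes in the block; only collection members can resolve them back to the account ID and balance.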
Then the MVCC validation happens; again it happens on the hashes of the account ID and the balance, because that is all that is present in all the other organizations as well. Then the commit happens. At the end, you can see that in the private collection AB the plain account ID and balance are present, whereas the hash of the account ID and the balance is available in every organization's database. So this is how the private data collection works. If there are any questions, I would like to take them. There is a question: is trust compromised for privacy, or is there another way of looking at it? Yeah, so Bala has asked a question related to trust. Basically, for example, in a blockchain network, suppose there are four members present in a consortium, like the example I gave. If there are competitors present in a single channel, then obviously nobody trusts anyone, because a competitor could read my data. Even if there is some kind of trust, I don't want to take that risk. Consensus in blockchain is mainly doing ordering; it's not really agreement on the data, it's simply ordering of transactions. So I wouldn't place trust in the consensus; it only comes down to whether the ordering node is malicious or not. And can we achieve privacy by using IPFS? One of the key reasons for adding private data collections is that we did not want to rely on any external service to store the actual data and then store only the hash in the blockchain, because then the smart contract cannot access the data present in IPFS or any other external data store. That was our main goal: to keep the private data collection within Fabric. Still, if somebody wants to use IPFS, they could, but they need to ensure the smart contract is able to access the external world. So, Sudil has asked a question.
Can you explain the read, write and double-spend version validation when updating private data? Okay. So, similar to public data: as I said, there is a hash of the key and a hash of the value stored in the public DB, and even these entries have a version number associated with them. When a piece of data is read, the read set records the hash of the key and the version that was read. Then, during the MVCC validation, because even non-members have the hash of the key, they can validate whether the version has changed or not. So when we have private data, it doesn't mean other peers have no information about it at all; they still have the hash of the content, we can associate a version number with it, and we can do the MVCC validation. And how is concurrency handled during ordering if two orgs send updates to the same account ID? Okay. Currently the ordering service does not peek into the transaction. The ordering service just receives a set of bytes, and the one main thing it does is access control: whether the submitter of the transaction is allowed to write to the channel or not. Then it creates the block and sends the block to the peers. In the peer there is the MVCC validation, right? Currently we do it sequentially: we take a transaction, perform MVCC validation, put its write set into a memory buffer, and then validate the next transaction. So currently it is serial, and that provides the serializability isolation level. What is the impact on business reporting? Okay. When I use a private data collection, obviously I don't want to show my data to others, so obviously they shouldn't do any analytics on my private data. That's my main goal; that's why I'm using a private data collection. So obviously that wouldn't be supported. Okay, no more questions.
So, I will move on. Next is private data dissemination. There is a push versus pull protocol in how the private data is shared between the members. There are two scenarios: first I will explain when the push of private data happens, and then when the pull happens. Okay. So, let's take a setup in which the smart contract has an endorsement policy of OR(OrgA, OrgB), and there is a collection with members A and B. When organization B is submitting a transaction, because the endorsement policy says the client can submit the transaction to any one of the organizations, it submits to organization A. So now the simulation of the transaction stores the actual private data into the transient DB of OrgA. But OrgB doesn't have that private data, right? So in the push protocol, what happens is that at the end of the simulation the private data is pushed to the other members of the collection: it pushes the account ID and balance to the other organization, which stores it in its transient DB. So during the validation and commit of the block, organization B can also get the private data. That's the push protocol. In the pull protocol, suppose organization B was unreachable in case one. Then at the eighth step, when it is fetching the private data, it tries to fetch from the transient DB, but the transient DB returns nothing, because the data is not available. At that time it pulls the data from the other members and then does the commit. So that's how push and pull happen. Usually pull should be avoided; it's only used when there is network isolation, that is, organization B is able to talk to the ordering service but is not able to reach organization A to get the data via push. Only in that case should pull be used; otherwise it shouldn't be.
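The push-then-pull fallback above boils down to: try the local transient store first (populated by push in the normal case), and only reach out to other collection members when it is empty. A hypothetical Python sketch, with peer stores modeled as plain dicts:

```python
def resolve_private_data(key, transient_db, collection_peers):
    """Prefer data pushed into the local transient store; fall back to
    pulling from other collection members only when it is missing."""
    if key in transient_db:              # normal case: push delivered it
        return transient_db[key], "push"
    for peer_store in collection_peers:  # degraded case: pull from a member
        if key in peer_store:
            return peer_store[key], "pull"
    return None, "missing"               # left for the reconciler to fetch later
```

The "missing" outcome corresponds to the missing private data state covered later: neither push nor pull succeeded, and a background reconciler retries.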
There is also something called missing private data; I will cover that very briefly. So let's look at how to define a private data collection for a chaincode. This is the sample JSON that we pass along when we instantiate the chaincode. The first thing is the name, that is, the name of the collection. The policy is basically the members of the collection. Then there is requiredPeerCount, which decides to how many peers the private data needs to be pushed at the end of endorsement. The blockToLive is nothing but how long the data stored in the collection should stay alive. This feature was mainly introduced for GDPR requirements, because in GDPR there is a requirement to keep the data alive only for a limited duration, after which the data needs to be removed. But because we don't have a time concept in Hyperledger Fabric, we use the block number, that is, the block height, as the lifetime of the data. And maxPeerCount is again something related to the push protocol. Then there are memberOnlyRead and memberOnlyWrite; I will talk about those shortly. For a chaincode, we can have any number of collections; there is no hard limit, though there is a practical limit based on the system resources and how many databases CouchDB can handle. But to create, say, four collections, we need four of these structures, each with a name, policy and all the other parameters. In 1.x we didn't have an endorsement policy for a collection, whereas in 2.0 we have introduced it, so here I'm talking about why. Let's take a scenario where there is a chaincode called myCC deployed on Org1, Org2 and Org3, and there is a collection A in which Org1 and Org2 are the members, and the chaincode endorsement policy says that any one organization can endorse a transaction. So, given that collection A holds private data associated with Org1 and Org2 only...
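For reference, a collection definition of the kind described here looks roughly like this. The values are illustrative, and the exact set of accepted fields depends on the Fabric version (the per-collection endorsementPolicy field is the 2.0 addition discussed next):

```json
[
  {
    "name": "collectionAB",
    "policy": "OR('Org1MSP.member', 'Org2MSP.member')",
    "requiredPeerCount": 1,
    "maxPeerCount": 2,
    "blockToLive": 100000,
    "memberOnlyRead": true,
    "memberOnlyWrite": true,
    "endorsementPolicy": {
      "signaturePolicy": "OR('Org1MSP.member', 'Org2MSP.member')"
    }
  }
]
```

One such object is needed per collection, which is the "four structures for four collections" point made above; blockToLive of 0 would mean the data is never purged.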
Basically, we cannot allow Org3 to write anything into collection A, right? Because that store belongs to Org1 and Org2. However, in 1.x, if the chaincode hadn't implemented any sophisticated access control, Org3 could definitely submit a simulation request to Org2 and then write into collection A. Or the Org3 client could manually create the transaction write set, get the signature of its own peer (because Org3 has access to that peer), and submit the transaction to add data to collection A. We didn't want that. So that's why in 2.0 we introduced a separate endorsement policy for the collection, so that for collection A we can define an endorsement policy with Org1 and Org2 as members, and Org3 cannot do any transaction on collection A. So we have three levels of endorsement policy: as you know, there is the chaincode endorsement policy; then there is the collection endorsement policy, which is what I just described; and even more fine-grained is the key-based endorsement policy. The collection endorsement policy also needs to be defined as part of this configuration. So let's talk about read-write access control on a collection. This is an implicit read-write access control; a user can still add read-write access control by writing code in the smart contract, but we added a feature to the configuration itself so that the peer enforces access control. In the same scenario, in 1.x an Org3 client could submit a simulation request to read some data from collection A if the smart contract hadn't implemented any access control on collection A. That kind of leaks some of the private data. That's why in 2.0 we introduced memberOnlyRead and memberOnlyWrite. With memberOnlyRead set to true, the peer itself checks whether the submitter of the transaction is a member of the collection; if not, it rejects the transaction and throws an error.
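The peer-side enforcement that memberOnlyRead and memberOnlyWrite add can be pictured like this. A hypothetical Python sketch of the check, not the actual peer code; the collection is modeled as a plain dict:

```python
def check_collection_access(submitter_org, collection, operation):
    """Reject reads/writes from non-members when the memberOnly flags are set.

    collection holds 'members', 'memberOnlyRead' and 'memberOnlyWrite',
    mirroring the fields in the collection definition.
    """
    is_member = submitter_org in collection["members"]
    if operation == "read" and collection["memberOnlyRead"] and not is_member:
        raise PermissionError(f"{submitter_org} may not read this collection")
    if operation == "write" and collection["memberOnlyWrite"] and not is_member:
        raise PermissionError(f"{submitter_org} may not write this collection")
```

With both flags true, an Org3 client's simulation request against collection A fails at the peer, so every chaincode no longer needs to repeat this check itself.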
Earlier, the smart contract needed to make this check, and because every smart contract needed to carry this code, we integrated the check into the peer itself. Similarly, there is memberOnlyWrite as well. In 2.0 we also introduced other APIs like GetPrivateDataHash; this is used especially for moving data from one collection to another. I'm not covering them in detail, but in fabric-samples there is now a separate use case dedicated to this API as well as to implicit collections. An implicit collection means that whenever a chaincode is instantiated, by default there is an implicit collection per organization, with that organization as the sole member. For example, here there is a chaincode myCC deployed on three organizations: by default Org1 has an implicit collection, Org2 also has one, and Org3 also has one, and only the respective organization can write data to it. This adds a lot more privacy. So you should probably look at fabric-samples, because this by itself would take more time to cover; that's why I'm deferring to the fabric-samples repo, if anyone is interested in learning more about implicit collections and how to use GetPrivateDataHash. Okay, before moving on, are there any questions? There is a question from Roger: if you are maintaining the private data, then why do we purge the data? And is there any case where data could be purged even with blockToLive set to minus one? So, when the blockToLive value is set to zero, we do not purge the private data at all. And the first question is: if you are maintaining private data, why do we need purging at all? Suppose I'm storing some customer data in the blockchain, and later the customer comes and says, remove my data. That's the GDPR use case: the customer can ask at any time to remove the data, but we didn't build an immediate "remove this data" feature.
We could have removed it immediately on request, but we didn't do that. Instead, what we did is provide blockToLive: whenever the data is stored, the user can specify that this data should stay in the blockchain for, say, a month, depending on the block height or the block addition rate. Okay, there is a follow-up as well; they mention a setup where the gossip was not happening properly. So usually, whenever people set up private data for the first time, there is definitely some error related to gossip, because when there is no private data usage there is a gossip endpoint configuration that need not be set, but when we are using private data and the private data is pushed and pulled, that parameter needs to be set. Because it is not needed for public transactions and is only needed for private transactions, people usually miss that configuration, and as a result the data is not pushed to the other members. If the configuration is set properly, it should work. And I think the documentation has also been updated recently to reflect that. Okay, next question: can a client app get access to the endorsement policy? There is something called service discovery, and there are APIs associated with service discovery in the SDK. The application can use those APIs to find out the endorsement policy for a given collection or chaincode, and can even find out whether the chaincode is deployed on a particular peer, etc. So service discovery is the place to look if you need to know about the endorsement policy. And Sunil asks: would you talk about adding a new org into a private collection on a need-to-know basis, and how this sync-up works? Okay. When I talk about the implementation, I will talk about how we handle that.
On the implementation side, I'm not talking explicitly about adding a new org, but we can add a new org, and the new org would pull the data from block one up to the current block height; I will come back to this briefly when that slide comes, since you asked. Then one more question: in what kind of store is the private data kept? Is it similar to the state DB? Close to, yes. The data is still stored in the state DB only. Wherever the public data is stored, in the same place we store this private data as well; it is just a separate namespace, where each key is prefixed with the collection name. That's it; there is no other fancy thing. Then there is another long question, but that looks like a debugging question, so I may not be able to answer it here. I think it's from Galaxy; sorry, probably we can take it offline, because I need to look at it a bit. Okay, let me get back to the presentation. Yeah, sorry, the heading is wrong: it should be a comparison between channels and private data collections. So, let's look at when to use a channel and when to use a private data collection. I have given a table, so for your use case you need to decide, by looking at the table, what is suitable for the problem you are addressing. The channel provides transaction-level privacy, which is the best privacy, I would say, because the participant information is completely hidden. The data is also hidden, because one channel cannot see any of the data present in another channel unless the members overlap. And even the smart contract invocation wouldn't be visible, because the other channel doesn't have access to the block. In a private data collection, only data privacy is provided.
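The namespacing point above can be illustrated with a one-liner. The separator and layout here are assumptions for illustration, not Fabric's actual byte-level key encoding:

```python
def namespaced_key(chaincode, collection, key):
    # Private state lives in the same state DB as public state; the two are
    # kept apart simply by prefixing each key with its namespace.
    return "/".join([chaincode, collection, key])
```

So a private key for collectionAB and a public key with the same name never collide, because they live under different prefixes in the one shared state DB.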
But the other members can still see who performed the transaction, which smart contract got invoked, and what parameters were passed to the smart contract. All those details leak some information. As for performance, using a large number of channels definitely improves it, because with channels there is parallel processing of blocks. Whenever we process blocks in parallel, resource utilization improves, and as a result the throughput also increases. With a private data collection it is still a single-channel block commit path, and compared to public data there is a performance degradation, which I will talk about. With channels there are data silos, whereas with private data collections there are no silos, because we can change the collection membership, remove a member, add a member, and we can also move data from one collection to another. And can we atomically modify data present in more than two channels? That's not possible, whereas we can do the same thing with collections: we can submit a single transaction that modifies more than two collections, provided the endorsement policies match. So that is one advantage of collections. Development and management complexity is debatable; it depends on the individual, I think. With channels we need to keep track of the number of channels and their configuration, etc., and with collections we also need to keep track of the number of collections and who all the members of each collection are. So in my opinion it's about equal. Yeah, so let's look at a comparison between the public and private data commit paths. Now we dive into the internal implementation details. On the left-hand side are the peer components involved only during a public data commit.
On the right-hand side there are more components, because the private data path involves more databases and storage components. In the normal public data path, when the block is received by gossip, it is stored on the block queue. Then the block is taken and given to the endorsement policy validator, which validates each transaction against the smart contract's endorsement policy. Then it goes to the serializability validator for MVCC validation. Then it comes to the committer: the committer appends the block to the block store, applies the write set to the state DB, and creates the index in the history database. If we take private data, you can see there are more components. One is the private write-set dissemination: that is what is involved during the push and pull of private data, and it is also the component that takes care of the transient store. Then there is something called the missing private data reconciler, which is used when neither a push nor a pull happened because of some network connectivity issue. In that case the missing data is marked in the private block store (there is a missing-eligible state there, right?), that entry is made, and then the missing private data reconciler periodically tries to fetch the missing data from the other members. That is another new component. The private block store is obviously also a new component, because now we have private data; it mainly stores historical private data. And you can see there is something called the missing-ineligible state: this stores the hashes of all the keys and the data associated with collections for which the peer is currently not a member. That's why it is stored as ineligible. So somebody asked a question, right, about how the data is pulled when a new member is added.
When a new member is added, the entries in the missing-ineligible state are moved to the missing-eligible bucket. Then the missing private data reconciler fetches the data from the other members. That's how the addition of a new member works. Then there is a purge DB, which stores a purge entry for each key and value whose blockToLive value is non-zero. In the state DB we store public state, hash state and private state. The public and hash states are the same across all the peers, because they are present in the block, whereas the private state varies depending on the collections and whether a given peer is a member of a collection. Then there is the history DB, and also the collection config history DB, which stores the historical collection definitions for each chaincode. This is used during push and pull; for example, when a new peer joins, it needs access to the historical definitions. That's why we have this database. Before going further, I would also like to check the time as well as the questions. Arun, do we have more time? I think it's 45 minutes. If you give me 10 more minutes, I will quickly cover the rest of the content. Okay, thank you. I will take questions at the end. So this is the exact block commit path. I can make the slide available later so that you can actually look at it; it covers the exact steps, in terms of the code that gets called, for both public data and private data. So let's get into the performance analysis. Here we use a four-organization setup; each organization owns a peer, and there is a private data collection with all four organizations as members. It's not a realistic one, but for performance benchmarking this gives us a simple setup.
Still, somebody could use such a setup if they want to hide the data from the ordering service, so it remains a valid scenario. We used the Smallbank benchmark, so you can see there is a set of APIs implemented in the chaincode, and this is the collection definition we used for our setup, with all organizations as members of the collection, as I mentioned. Okay, let's get into the numbers. In this graph, the x-axis is milliseconds and the y-axis is the throughput achieved. The milliseconds here refer to the block commit path: given a block, how much time does it take to commit it? With private data, the peak throughput was around 720 TPS, whereas with public data we could get around 1,440 TPS, obviously more than a 50 percent drop. But we could improve the performance, and that's what we did: we introduced five optimizations. Again, these are low-level implementation details, so I'm covering them at a very high level. We do a string-based comparison for the membership check instead of a full policy evaluation; we did this in 1.4 and 2.0. We also purge the transient store in the background rather than doing it in the critical path. Then, the commits to the block store and the private data store used to happen serially; we could do them in parallel, though it complicates the recovery mechanism. Similarly, we can commit to both the purge DB and the state DB in parallel; that we also did. And we use a cache for the transient store. For each individual optimization, I have mentioned the maximum throughput improvement we got. You can see that with all five optimizations we could get up to 1,344 TPS, whereas with public data it's 1,440, right? So we nearly closed the performance gap: only about a 6 percent drop in performance now.
But unfortunately, not all of these optimizations are available in Hyperledger Fabric yet. I think the first two have been merged, and the remaining three we are planning to do once the ledger checkpointing work is done, so probably they will be available early next year. You can track the Jira items here for the rest of the optimizations. So currently we can get up to around 1,000 TPS with a private data collection, whereas with public data it could be around 1,400 TPS. Again, this is not an official number, because the number can vary depending on the machine, the machine configuration, the use case and many other factors; this is just the number for the setup we used. And this is one of the main recommendations: whenever you are defining a private data collection, ensure that requiredPeerCount and maxPeerCount are set in such a way that the push alone disseminates the private data. There should be no need for a pull, because if pull is used, it can significantly drop the performance; something like a 60 percent drop can occur. So these need to be set properly. There is also a configuration called skipPullingInvalidTransactionsDuringCommit, which also helps in improving the performance. And we are doing more studies: when a new member is added, how much time it takes; if you have a large number of collections, how that impacts performance; and the performance of implicit collections. Yeah, so now to the questions. Vikram asks: can one org own the collection but another org request access to the data of a particular transaction? Yes, it's possible. For that, you need to set memberOnlyRead to false when defining the collection, and then in the chaincode you need to manage all the access control.
You could even have chaincode-managed state to decide when to dynamically allow permission for a particular member. For example, you could store a secret in the chaincode state and share the secret with the other member; when the other member presents the secret, you allow that member to read the data through the chaincode. So that kind of thing is possible. Next: would the underlying transaction log be different for orgs inside the private data collection and orgs outside it, both being part of the same channel? If you look at the log, the log is just the block store, that is, the blocks, and the block is the same across all the peers. Because we compute hashes, there is a hash chain, so it shouldn't change at all. That's why I said we always include the hash of the key and the hash of the value in the block; that is consistent across all the nodes, and only the state DB data can vary. Some peers have the private data and some don't, but the block store, being the source of truth, is consistent across all the peers. And what was the block size during the performance test? That was a block size of 100. And what exactly is the purpose of the purge DB and the history DB? I assume the state DB is what's consulted for the read and write sets during validation; can you please clarify that point? Okay. What the purge DB actually stores is the purge schedule for each key-value pair stored in a private data collection. Suppose we defined a private data collection with blockToLive as 10. It means that any key-value I store in that collection will stay alive in the peer only for the next 10 blocks. So in the purge DB we create an entry saying that on, say, the 20th block commit this key needs to be removed from the state DB. That's the purge DB's job.
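The purge bookkeeping just described amounts to: on commit, schedule each private key for removal blockToLive blocks later, and on every subsequent block commit, drop whatever has come due. A minimal Python simulation (illustrative only, not the peer's data structures):

```python
def schedule_purge(purge_db, key, commit_block, block_to_live):
    """blockToLive of 0 means 'never purge'; otherwise record the block
    height at which this private key must be removed from the state DB."""
    if block_to_live > 0:
        purge_db[key] = commit_block + block_to_live

def purge_due_keys(purge_db, state_db, current_block):
    """Called at each block commit: remove every key whose deadline passed."""
    for key in [k for k, due in purge_db.items() if due <= current_block]:
        state_db.pop(key, None)
        del purge_db[key]
```

So a key committed at block 10 with blockToLive 10 survives through block 19 and disappears at the block-20 commit, which is the GDPR-style expiry described above.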
It basically stores all the entries that need to be removed whenever a block commit happens. That's what the purge DB does. And the history DB is basically an index into the block store. There is a GetHistoryForKey API, right? Using that API, we can get the history of all values for a given key. So it is basically an index into the block store: which file, which offset, etc. That's roughly the complete story. Yeah, I think there are no more... there is one more question: what was the block time for the performance test? That's one second. For all the performance tests, I keep it at one second, because we also measure performance as throughput per second, so one second is good for reporting performance numbers. Is it not advisable to use the private data range query result while using private data, keeping the purge in mind? Yes, yes, that's true. There could also be missing private data, right? For example, if a network partition has happened, then the matching data would not be present in the state DB. So even in that case the private data query could return incomplete results, and likewise when there is a purge, it may not return all the data. That's why, with very careful design, we need to assign the blockToLive and the other parameters; but you are correct. Yeah, thank you, Senthil, a nice and deep technical talk, like your blog. Oh, thank you, guys. So, next we have Muhammad Usama Saddar, and he's going to talk about how attestation works in Hyperledger Avalon. Hyperledger Avalon is a trusted compute framework based on Intel SGX CPUs, where you run your services and programs inside protected enclave memory, which gives you privacy features. So, over to you, Muhammad. Thank you very much. Can you hear me clearly? Yeah, very clear. Okay, perfect. Let me share the slides. Okay, so let's start.
So I'm Usama, and I'm going to talk about the formal foundations for attestation in Hyperledger Avalon. If you're not familiar with any of these three terms — formal foundations, attestation, or Hyperledger Avalon — I will give a brief overview. This work is under the supervision of Professor Christof Fetzer, and I would also like to thank my colleagues Rasha, Donald, Franz and Saith, who were involved in this work. So, the outline: I will first introduce the three terms that I just mentioned in the title. Afterwards, I will describe some related work which has been done in this domain. Thirdly, I will talk about the current attestation mechanism used in Avalon, which is called EPID, and the future attestation mechanism currently being developed and integrated into Avalon, which is called DCAP; I will describe that briefly. And finally, I will describe the formalism that we used in order to analyze it. Okay, so briefly about Hyperledger Avalon. Unlike the projects in the three talks that you have attended, it is a ledger-independent tool, which supports Fabric and Ethereum, for example, at the moment. The main aim of this tool is to improve blockchain scalability and transaction confidentiality. How is this done? By implementing the Off-Chain Trusted Compute Specification, which is published by the Enterprise Ethereum Alliance. There are different kinds of trusted compute possible and supported — for instance zero-knowledge proofs, multi-party compute, and trusted execution environments — and trusted execution environments are the solution selected for Avalon. There are reasons for that, which I will explain when I talk about trusted execution environments. Before that, I would like to talk about the security paradigms and why this is important, or what is new in this.
So the data security domain, if you look at it at a very high level, consists of three main paradigms. The first one is data at rest, for instance the data residing on your hard disk. The second one is data in transit, for instance on untrusted public networks. And the third one, really the new paradigm, is data in use, when we have computations being done on the data. This is really critical because, for the most part, applications need to have the data in the clear in order to operate on it. So that is why it is important. Hardware-based trusted execution environments are hardware-based techniques to isolate the data from untrusted entities, which include not only the OS but also other untrusted entities. For instance, this is really important when you have an application and you want to run it in an untrusted environment such as a public cloud. Then you need some isolated environment created for the computations to remain confidential. This is really where trusted execution environments are good, and various solutions have been proposed, for instance Intel SGX. So these are the three major solutions for trusted execution environments. Avalon currently uses Intel SGX for the implementation. Right.
So what is attestation, and why is it so important in Intel SGX? Attestation is roughly a way of giving trust to the challenger. The challenger here means that, when we have off-chain computations — when we want the blockchain to offload some computation off-chain — the challenger is the entity asking the worker to do that computation, and we want a guarantee, a kind of trust, that the right application, the right computation, is being performed inside the right platform. So the verification done here is for different entities, and a major one is the identity of the enclave, the enclave being the protected region of memory where the computation is being done. We verify that the identity of the enclave is what we expected. Another important thing is the validity of the platform: that the platform itself on which the computation is being done is the correct one, is not under the control of the adversary, and, for instance, is not something which is being simulated and just giving us results. We need to verify that it is the right platform. This is very important because we need to provision secrets to the enclave, and before provisioning we need some kind of guarantee that our enclave and the platform are really what we are expecting, before we can put our secrets into them. So we need to consider the attestation mechanism. Talking specifically about attestation in Intel SGX — as I mentioned, Avalon currently uses Intel SGX — the attestation process in Intel SGX is of two types. The first one is local attestation, where we have two enclaves.
So we have two enclaves which are on the same platform, and they attest to each other that both of them are residing on that platform. The second type is remote attestation, when one of these enclaves proves its identity to a remote platform. Then we have the two main types of remote attestation, as I showed in the outline. The first one is EPID, Enhanced Privacy ID, and this is what is currently being used in Avalon. It is based on the concept of privacy, in that your identity remains hidden. The second one is called DCAP, Data Center Attestation Primitives. This is more about giving you a full in-house infrastructure for attestation, where you do not need to contact Intel each time for the verification of your quotes — and what quotes are, I will explain in the coming slides. The third part of the introduction, as in the title, is formal methods: what they do and why they are so important. Here I have mentioned that they are basically mathematical techniques which guarantee that a model satisfies its requirements. So we represent the system in a mathematical way and then check that the model we created satisfies the requirements. Here is a motivation for why we need formal methods: there was a security protocol, the Needham-Schroeder protocol, which was used for 17 years, and only after 17 years was it discovered that the protocol had a flaw — and it was discovered using formal methods. So you can see that the potential of formal methods in security is really great, and this is what we are trying to do alongside the implementation of Hyperledger Avalon: to formally guarantee that the system satisfies its requirements.
Right, so here I show the general flow of formal methods. We have a system — whatever system is given to us, for instance this attestation protocol, or at a larger scale the Hyperledger Avalon system. We create an abstract model of the system, which is basically a mathematical representation that we can analyze. Then we have the requirements — what the system should satisfy — and we want these to be represented precisely in mathematical form so that they can be verified; we call this the specification, meaning they are represented as properties. Finally, we check that the abstract model satisfies the specification. If it does, we have a result in the form of a verification proof. If it does not, we get a counterexample, which shows under which scenario the model of the system does not satisfy the specification. Then we can go back and check: was there a problem in the abstract model, or in the conversion or translation we did in order to model the system? We can rectify the problem and repeat this procedure until the proof of the system can be done. So this is the overall idea of what formal methods are. This kind of work is not new, so there are some related works which have been done specifically for attestation, and I will briefly describe the two most prominent of them.
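The check-and-refine loop just described can be made concrete with a minimal reachability checker — a toy breadth-first exploration of an abstract model's state space, not any of the actual tools mentioned in the talk. The model, transition function, and "bad state" predicate below are invented for illustration:

```python
from collections import deque

def check(initial, transitions, is_bad):
    """Explore the abstract model's state space; return a counterexample
    trace leading to a bad state, or None if the property holds."""
    seen = {initial}
    queue = deque([(initial, [initial])])
    while queue:
        state, trace = queue.popleft()
        if is_bad(state):
            return trace          # counterexample: go back and refine
        for nxt in transitions(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, trace + [nxt]))
    return None                   # property verified on the abstract model

# Toy model: a secret leaks if it is ever sent unencrypted.
def transitions(state):
    if state == "secret_plain":
        return ["sent_encrypted", "sent_plain"]
    if state == "sent_plain":
        return ["leaked"]
    return []

trace = check("secret_plain", transitions, lambda s: s == "leaked")
print(trace)  # ['secret_plain', 'sent_plain', 'leaked']
```

The returned trace plays the role of the counterexample in the flow above: it tells you exactly which scenario violates the specification, so you can fix either the system or the model and re-run the check.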
The first work was by a group of researchers at UC Berkeley and MIT. They proposed an abstract way of formalizing attestation, but specifically for Intel SGX they do not provide the proofs, and this remains a gap. Then Intel has, in some of its documentation, mentioned some of the work it has done on this and some of the tools it has used for verification. For instance, they mention that for the sequential setting they use a deductive verification framework, DVF, which is a tool developed at Intel, and for the concurrent setting — in which multiple threads execute in parallel — they have developed the iPave tool, which is a graphical framework, together with its automated version, called Accordion. The problem is that these three tools developed by Intel, which they mention in the literature as having been used for the verification of Intel SGX, remain out of reach of the public, so a normal user cannot have the guarantees — or, let's say, cannot know what was verified and what kind of system model was used. This remains a limitation, and a lot of new attacks have been discovered in the last few years, which limits the confidence of the user in Intel SGX. That's why we need to provide more guarantees for the system, and this is what we do by making use of an open-source tool, allowing the user to gain more confidence. Okay, so a brief comparison between our proposed approach and the three Intel tools I showed on the last slide. The first comparison is on the basis of concurrency: DVF does not support concurrency, so it is sequential, as I showed on the last slide, and iPave and Accordion, which are for the concurrent setting, do not allow non-determinism.
And, as I already mentioned, none of these tools, nor the proofs that were done, are available to the public. And just to be fair, our proposed solution does not have too much implementation detail. The idea was to verify at the symbolic level to begin with, and then, as implementation details become available, we can move to implementation-level tools as well. So, briefly, the current attestation mechanism in use in Avalon. There are a number of entities involved, so I will give a brief introduction to all of them and then describe the protocol. The first one is the application enclave — the protected, isolated region created by Intel SGX — and the second is the application itself. Then there is the Quoting Enclave, QE. This is an Intel architectural enclave which is provided in order to verify reports, and if a report is correct, after verification it can sign the report to generate a quote. And finally there is the Provisioning Certification Enclave, PCE, which is the local certificate authority for this Quoting Enclave. These four entities form the user platform, and outside the platform is the challenger — the entity trying to verify that the code or application is really running inside the enclave; this challenger is also called the relying party. And finally, in EPID we have the Intel Attestation Service, IAS, which represents the Intel infrastructure that is going to verify that the quotes are correct. Outside the platform we also have these quotes — the reports after conversion; I will describe in the flow later what the difference between them is. And then there is communication between all these entities.
How they exchange information with each other, I describe on the next slide. So we have all these entities again: the Intel Quoting Enclave, the SGX application enclave, the SGX application itself, the remote challenger — what I called the challenger on the previous slide — and the Intel Attestation Service. The process starts with the remote challenger. Just to remind you, the SGX application enclave, in the case of Avalon, represents the worker which is doing the computation. So the process starts with the remote challenger, which sends a challenge including a nonce to the SGX application. Then the SGX application gives information to the application enclave about the MRENCLAVE — basically the measurement hash — of the Quoting Enclave. This Quoting Enclave is the target for the report that will be generated by the worker. As I will show on the next slide, only the target specified here will be able to verify that report, and no other, because of the key which is involved. Okay, and then the SGX enclave, the worker, will create a report which itself contains a MAC over that data. So inside this local attestation on the user platform, we are using symmetric-key cryptography, specifically a MAC, to protect the data. This report, after generation, is sent back to the application. I will quickly switch to the next slide and then come back to how this report is created. The generation of the report needs some target information — as I mentioned, it needs the MRENCLAVE of the Quoting Enclave — which means that particular information is added here.
And the key is derived based on that target information, which means that only the target which already has that information in its SGX Enclave Control Structure (SECS) will be able to derive the key, get access to the data — sorry, will be able to verify the MAC. So the process, in short, is like this. The report body is the major part of the report, and other than the report body we have two other fields; here I show that the report body becomes part of the report. We have the value for key wear-out protection, and finally we have the MAC, which becomes part of the report and is computed over the report body. The key used for computing this MAC is shown here. What I want to show with this somewhat ugly-looking arrow is that, in the beginning, the value for wear-out protection is stored in a temporary report buffer, and that stored value is then the one actually used in the derivation of the key, because there could be an attack in between. Other than that, we have the target information, as I mentioned previously, which comes from the application and describes who the target is — for whom this report is created — and then some other data which is involved in the key derivation. Based on that, the derived report key is generated, and by "target" I mean this key is specific to that target; no other enclave will be able to derive it. Right, and then I think that's all, and I come back to this slide. So this was the process I showed here: the creation of the report containing the MAC. This report is then sent back to the SGX application, which forwards it to the Intel Quoting Enclave.
And here you can see that this report will be verified, because it was targeted at this Quoting Enclave with its MRENCLAVE. I will now switch again to this slide. What the Quoting Enclave does is a similar process for the key derivation, except that instead of the target information it uses its own information from the SECS register, the SGX Enclave Control Structure. So it uses its own MRENCLAVE — and because the report was generated using exactly that target information, the same key will be generated at this end. Coming back: it can compute this key, and with that key it computes the MAC over the report body again, and if the two MACs match, the report is verified. Once the report is verified, it can then check the hash values inside the report, and if everything is okay, it signs it with the EPID private key, which it has been provisioned with by the Provisioning Certification Enclave — as I showed, this is the local certificate authority for it. After signing, this generates a quote, which is sent over to the SGX application, which just forwards it to the remote challenger. Now, since this has been signed with the private key on this end, it can be verified using the public key that the data — the report, or rather the quote — is actually correct. Then, after the verification of the signatures, it can also verify that the enclave status and everything else is okay, that the platform is not a simulated one, and so on. So the EPID mechanism basically depends on the Intel Attestation Service, which adds some lag to the verification.
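The symmetric report flow above can be sketched in a few lines. This is a rough analogy only: in real SGX the key derivation happens inside the CPU via the EREPORT and EGETKEY instructions from fused hardware secrets, whereas here an HMAC over invented names (`DEVICE_SECRET`, `derive_report_key`) stands in for it:

```python
import hashlib
import hmac

DEVICE_SECRET = b"per-cpu-fused-secret"   # stand-in for the hardware root key

def derive_report_key(target_mrenclave: bytes, key_id: bytes) -> bytes:
    # EGETKEY-style derivation: only an enclave whose own measurement
    # equals target_mrenclave can reproduce this key.
    return hmac.new(DEVICE_SECRET, target_mrenclave + key_id,
                    hashlib.sha256).digest()

def create_report(report_body: bytes, target_mrenclave: bytes, key_id: bytes):
    # EREPORT-style: MAC the report body under the target-specific key.
    key = derive_report_key(target_mrenclave, key_id)
    mac = hmac.new(key, report_body, hashlib.sha256).digest()
    return report_body, key_id, mac

def verify_report(report, own_mrenclave: bytes) -> bool:
    # The Quoting Enclave derives the key from its *own* measurement;
    # the MACs match only if it really was the intended target.
    body, key_id, mac = report
    key = derive_report_key(own_mrenclave, key_id)
    return hmac.compare_digest(mac,
                               hmac.new(key, body, hashlib.sha256).digest())

qe_measurement = hashlib.sha256(b"quoting-enclave-code").digest()
report = create_report(b"worker measurements + user data",
                       qe_measurement, b"nonce-1")
print(verify_report(report, qe_measurement))                     # True
print(verify_report(report, hashlib.sha256(b"other").digest()))  # False
```

The second check failing is the whole point of the target information: any enclave other than the intended Quoting Enclave derives a different key and cannot verify the MAC.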
And due to this limitation, what Avalon is trying to do is switch to, or provide as a secondary attestation mechanism, this DCAP, the Data Center Attestation Primitives, which I will briefly describe. All the entities you see here are actually the same, except that the Intel Attestation Service is replaced by this caching service. What this caching service represents is a local service, which can sit at different layers, where you cache all the certificates required for verification, as well as all the revocation lists, and based on that you can perform everything in house; you do not need to contact Intel each time for the verification of the quotes. This solves availability issues — for instance, if the Intel Attestation Service is not available, you are stuck — and it also leads to an increase in performance. Right. So here we have this kind of flow of the attestation process, where you can see a couple of libraries are involved: the quote provider library and the quote generation library, and the Provisioning Certification Enclave, the local certificate authority for the Quoting Enclave, is part of this as well. These libraries basically provide you the template which you can utilize for creating the infrastructure which DCAP is used for. And then the process is like this: the application wants to get the SGX target information — just as in EPID, where the application had that information to begin with, namely the MRENCLAVE of the Quoting Enclave.
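The caching idea can be sketched as a simple fetch-once store. This is a toy illustration; `CachingService` is an invented name, and the real DCAP caching service manages PCK certificates, TCB information, and revocation lists obtained from Intel's provisioning service:

```python
class CachingService:
    """Toy in-house cache: contact the remote provisioning service at most
    once per artifact, then verify quotes entirely from local data."""

    def __init__(self, fetch_remote):
        self.fetch_remote = fetch_remote   # e.g. a call out to Intel
        self.store = {}
        self.remote_calls = 0

    def get(self, artifact_id):
        if artifact_id not in self.store:          # miss: go out once
            self.remote_calls += 1
            self.store[artifact_id] = self.fetch_remote(artifact_id)
        return self.store[artifact_id]             # hit: fully local

remote = {"pck-cert": "certificate-bytes", "revocation-list": "crl-bytes"}
cache = CachingService(lambda k: remote[k])
for _ in range(1000):                              # many quote verifications
    cache.get("pck-cert")
    cache.get("revocation-list")
print(cache.remote_calls)  # 2
```

A thousand verifications cost only two remote fetches, which is the availability and performance argument made above: once the artifacts are cached, an Intel outage no longer blocks quote verification.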
And in this case, I've elaborated the process here. The Quoting Enclave deals with the PCK, the Provisioning Certification Key, which is the private key of the PCE, the Provisioning Certification Enclave. The quote provider library provides the Quoting Enclave with the TCB and the Quoting Enclave certification data, and based on this TCB which is received, the Quoting Enclave generates its asymmetric key pair — the attestation key and its public part. This key pair is based on the TCB, so whenever the TCB changes, these keys will also change. After generating this key pair, the public part is sent over to the Provisioning Certification Enclave, along with the Quoting Enclave ID as well as the TCB which it obtained from the quote provider library. The PCE, on its end, generates its own key pair — the PCK private key and its public part — and then sends a certificate back along with its public key. The Quoting Enclave then sends its target information back to the application. So the application now has the target information, which it can send to the application enclave, and the application enclave can create a report based on this target information. This report is sent back to the application, which sends it on to the Quoting Enclave, and from here the process is quite similar to before, with a few differences which I will show on the next slide. Basically, what happens is that the Quoting Enclave verifies the report as before, because the target information is now available to it.
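The TCB-dependent key generation mentioned above can be caricatured like this. It is a toy model with invented names (`attestation_keypair`, `qe_secret`); the real derivation is done inside the Quoting Enclave with proper asymmetric cryptography, but the property illustrated — same TCB gives the same key pair, a changed TCB rotates it — is the one described in the talk:

```python
import hashlib
import hmac

def attestation_keypair(tcb: bytes, qe_secret: bytes):
    """Toy model: the QE's attestation key pair is derived from the current
    TCB level, so a TCB change (e.g. a microcode update) yields new keys."""
    seed = hmac.new(qe_secret, tcb, hashlib.sha256).digest()
    private = seed
    public = hashlib.sha256(b"pub" + seed).hexdigest()
    return private, public

k1 = attestation_keypair(b"tcb-level-5", b"qe-sealed-secret")
k2 = attestation_keypair(b"tcb-level-5", b"qe-sealed-secret")
k3 = attestation_keypair(b"tcb-level-6", b"qe-sealed-secret")
print(k1 == k2)  # True  -- same TCB, same key pair
print(k1 == k3)  # False -- TCB changed, keys rotate
```

Tying the attestation key to the TCB is what lets a verifier reject quotes produced on a platform whose firmware has since been deemed out of date.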
This report was generated with this specific target, the Quoting Enclave, so it can verify the report. The concept here is again the same, but now it is based on a certificate chain rather than the EPID private key. After the verification of the report, if everything is fine, the Quoting Enclave signs the report body with the attestation key — that is, its private key — generates a quote, and sends the quote over to the SGX application. The quote can then be verified using the certificate chain, rooted in the root certificate, and the attestation process is complete. Now I'll give an idea of the structure of the quote, how it is built. The main fields which are important here involve the security version numbers. The Quoting Enclave, as I mentioned, is one of the Intel architectural enclaves, and its security version number is important in order to see whether it is up to date or not; likewise, for its local certificate authority, the Provisioning Certification Enclave, you have its security version number. The user data field can be used to communicate keys in order to establish a channel. Then we have the actual report body which was generated earlier. Then there is the quote signature data length — the quote signature data is a variable-length structure, so we also need to store its length — and the quote signature data itself consists of all these fields. Important here is that it contains a report for the Quoting Enclave itself, so we can check that it contains the right measurements, and we also need the public part of the attestation key. Another important field is the Quoting Enclave certification data: we want to know what kind of certification data we have and its size — the certification data can again be variable length, so we send its size along with it. So now I will move on to the proposed formalism.
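The quote layout just walked through can be sketched as a data structure. The field names below paraphrase the talk and are not the exact identifiers of Intel's DCAP quote format; the authoritative layout is in Intel's DCAP quoting library documentation:

```python
from dataclasses import dataclass

@dataclass
class QeCertificationData:
    cert_type: int        # which kind of certification data follows
    size: int             # variable-length, so the size travels with it
    data: bytes

@dataclass
class QuoteSignatureData:
    signature: bytes            # report body signed by the attestation key
    attestation_public_key: bytes
    qe_report: bytes            # report of the Quoting Enclave itself
    qe_report_signature: bytes
    certification_data: QeCertificationData

@dataclass
class DcapQuote:
    qe_svn: int           # security version number of the Quoting Enclave
    pce_svn: int          # security version number of the PCE
    user_data: bytes      # e.g. a key for establishing a secure channel
    report_body: bytes    # the application enclave's measurements
    signature_data_len: int   # signature data is variable-length
    signature_data: QuoteSignatureData

sig = QuoteSignatureData(
    signature=b"sig",
    attestation_public_key=b"ak-pub",
    qe_report=b"qe-report",
    qe_report_signature=b"qe-sig",
    certification_data=QeCertificationData(cert_type=5, size=3, data=b"pem"),
)
quote = DcapQuote(qe_svn=2, pce_svn=11, user_data=b"channel-key",
                  report_body=b"app-enclave-measurements",
                  signature_data_len=len(sig.signature), signature_data=sig)
print(quote.qe_svn, quote.pce_svn)  # 2 11
```

Laying it out this way makes the verification story visible: the security version numbers and the embedded QE report let the relying party judge platform freshness, while the certification data carries the chain back to the root certificate.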
This is the formalism we have selected in order to formally analyze the protocol. As I mentioned in the introduction, we analyze it using formal methods in order to provide some mathematical guarantees, so I will now describe what kind of formalism we have selected, and the reason for it. Okay, so the workflow of the approach is like this. We have a system configuration consisting of all the entities involved in the attestation protocol, whether it is EPID or DCAP — and you have seen that for EPID there was the Intel Attestation Service, while for DCAP there was the caching service and some extra quote libraries. The operation policies capture the communication and the computation done by these entities. So there are two main things: the computation done by each entity, which can be modeled formally as a function, and the communication, which is modeled using channels. Based on these three things — the entities, the computations, and the communication — we generate the symbolic model. If you remember the introduction slide, we are on the left side, where we had the system or a model of the system; we are at the step of creating the symbolic model. The formalism we use here is the applied pi calculus, and the reason for using it is that it carries more information compared to simple tree automata, and it also gives you guarantees for an unbounded number of channels, an unbounded number of messages, and an unbounded number of sessions between the entities — which are exactly the kinds of guarantees we want. On the right side, as I showed before, we have the requirements; in general terms, these are now the security goals — what we want from a security perspective, which can be confidentiality or integrity,
for instance. We formalize these — specify them — as security properties, and we combine the two. What we do here is an automatic translation of the symbolic model into Horn logic; there is an automatic tool which can do that, named ProVerif, which is what we utilized. The symbolic model is then represented by Horn clauses, and the security properties are represented as derivability queries on these Horn clauses. Then there is a process of resolution, which means we check whether these queries are derivable from the Horn clauses or not. If the fact is derivable, we then try to perform attack reconstruction at the pi calculus level — the model which we created. This box I have drawn to show that all of this is done automatically by many tools, including ProVerif, the one we utilized. If the attack reconstruction succeeds — meaning the reconstruction at the pi calculus level is correct — then we have successfully found a counterexample to the property: the property fails on the given system for a specific reason, which we can then explore to understand why it failed. And because, as I mentioned, the guarantees we are getting are for an unbounded number of sessions and an unbounded number of messages between the parties — which, as you can imagine, is a large state space — it is quite possible that the tool cannot reconstruct the attack at the pi calculus level. In that case it will answer "don't know", and we then have to go and check in depth what kinds of problems are possible. And then there is another possibility:
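The derivability idea can be caricatured in a few lines — a toy forward-chaining engine over ground Horn clauses, nothing like ProVerif's actual resolution over clauses with variables: the protocol becomes facts and rules over an `attacker(x)` predicate, and the query asks whether a fact such as `attacker(secret)` is derivable.

```python
def derivable(query, facts, rules):
    """Forward chaining over ground Horn clauses.
    Each rule is (premises, conclusion); saturate, then test the query."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return query in known

# Toy protocol: the attacker sees the ciphertext on the network; the rule
# says that with both ciphertext and key, it derives the plaintext secret.
rules = [
    (["attacker(enc(secret,key))", "attacker(key)"], "attacker(secret)"),
]
leaky = ["attacker(enc(secret,key))", "attacker(key)"]   # key also leaked
safe = ["attacker(enc(secret,key))"]                     # only ciphertext

print(derivable("attacker(secret)", leaky, rules))  # True  -> possible attack
print(derivable("attacker(secret)", safe, rules))   # False -> goal holds
```

A derivable query corresponds to the "try to reconstruct the attack" branch in the flow above; a non-derivable one corresponds to the tool answering that the security goal holds.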
I mentioned the case when the fact was derivable. If the fact is not derivable, the tool returns "true", which means that the security goal we were trying to verify on this specific model is actually satisfied. So, yeah, right — now to the challenges in the specification of the protocol itself. I must say that the literature, and even the formal documentation available from Intel, is not so well described — and that is the kindest wording I can use for it. For instance, take the "Intel SGX Explained" document, which is in fact the most cited document in this literature. Regarding the report key derivation — I described the structure of the report to you — it claims that the padding for this key derivation is hard-coded in the case of the EREPORT instruction, which is used for the generation of the report (where, as I described, the target information comes in), and that in the case of the EGETKEY instruction it is obtained from the SECS. We explored the Intel Software Developer's Manual from 2019, as well as its older versions, to figure out the reason for that, but it is actually the reverse, as evidenced by Intel's own literature. So, as I mentioned, the Intel literature itself contains some problems. There were various ambiguous statements whose intent we could not figure out, which was really part of the motivation for going for formal methods, to understand more about how the process actually happens. For instance, they write that "the Quoting Enclave report is a report when the Quoting Enclave report is certified", which actually made no sense to us. And not only that — there is in fact a lot of inconsistent information in Intel's official literature, specifically about this remote attestation process.
So there was really a need for a formal specification, or a precise description, of the protocol itself — also in light of the attacks of recent years — and for coming up with a better, or secondary, attestation mechanism. So what we did — I am now describing the right part of the flow; just to go back a little, I am now describing the security goals, how we formalize these properties, and an overview of how we model the process. Okay, so confidentiality of the data is very important in this case. As I mentioned, providing it is a main aim of Hyperledger Avalon, in order to meet the specifications. The challenger in this case again represents the relying party, and when it sends an encrypted secret to the platform, we need to ensure that the data is not available in the clear to the attacker. This can be represented simply as a reachability property, meaning that we check whether there is any state in the system where the attacker can get hold of this secret — which was sent in encrypted form — in its plaintext form. The second important property is integrity. A simple way of handling this is to have a couple of events — which in ProVerif are called events — a "message unchanged" event and a "message accepted" event. Just imagine two entities talking to each other: at the receiver, there is a state where the receiver can check whether the message is unchanged, and a state where a message is accepted by the receiver. So, for every message that is accepted by the receiver,
we need to have a state in which the message was unchanged, and that provides some guarantees. But the problem is that multiple accepted messages can correspond to a single "message unchanged" event. In the language of formal methods, these are called correspondence assertions: we check the property that for every message accepted by the receiver, there was a "message unchanged" event before it. Here I have duplicated that same scenario, and what I want to show is that the solution is a one-to-one correspondence between the events — ACC is short for "message accepted" and UNC for "message unchanged", to fit them on a single line. For each "message accepted" event we require a distinct corresponding "message unchanged" event, which establishes a one-to-one correspondence, so the scenario where multiple messages are accepted from the same "message unchanged" event is no longer possible. This is formally called an injective correspondence assertion. Additionally, we need to check the reachability of the "message accepted" event. Why? Because if this event is not reachable at all, the query will still return an answer of true, which would be misleading: no message was ever accepted, yet the tool says everything is good. So what I mean is, we are checking the property that for every message which is accepted there is a corresponding "message unchanged" event before it.
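The injective correspondence plus the reachability side-condition can be illustrated on concrete event traces. This is a toy checker over finite traces, not ProVerif's unbounded symbolic analysis, and the event names mirror the ACC/UNC shorthand from the slide:

```python
def integrity_holds(trace):
    """Check, on a finite event trace, that
    (1) reachability: at least one ('acc', m) event occurs, and
    (2) injective correspondence: every ('acc', m) is matched one-to-one
        with an earlier, not-yet-consumed ('unc', m)."""
    unmatched = []                    # earlier 'unc' events still available
    accepted_any = False
    for kind, msg in trace:
        if kind == "unc":
            unmatched.append(msg)
        elif kind == "acc":
            accepted_any = True
            if msg in unmatched:
                unmatched.remove(msg)  # consume: one 'unc' per 'acc'
            else:
                return False           # accepted without its own prior 'unc'
    return accepted_any

print(integrity_holds([("unc", "m1"), ("acc", "m1")]))                 # True
print(integrity_holds([("unc", "m1"), ("acc", "m1"), ("acc", "m1")]))  # False
print(integrity_holds([]))                                             # False
```

The second trace fails exactly because of injectivity — two acceptances consume only one "unchanged" event, the replay scenario described above — and the empty trace fails the reachability side-condition rather than vacuously passing.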
the query will also return an answer of true, saying everything is fine — but that is not what we want. We also want some message to actually be accepted, and then, for each accepted message, a corresponding message-unchanged event. Right, so that is why, in addition to these injective correspondence assertions, we also need to check the reachability of a message-accepted event. So this is, at a high level, the idea of how we verify the properties using ProVerif. Now let me summarize the work. We know that Hyperledger Avalon is a tool for improving scalability and confidentiality. We have seen the specification, or the flow, of Intel SGX EPID and DCAP, the two attestation mechanisms provided by Intel SGX. I also want to mention that although Intel SGX was selected, it is of course not the only possibility: Avalon can be implemented on other trusted execution environments, such as the AMD secure processor. In this process of specification we discovered various discrepancies, even in the official literature, as I mentioned for a few of them. And we analyzed confidentiality and integrity in the symbolic model — the two most important properties in this scenario: confidentiality, that our data remains protected, and integrity, that our data is unchanged. As for future work in this direction: Intel recently announced — not yet released — its Trust Domain Extensions (TDX), which give more power to the trust domains in comparison to legacy virtual machines. Intel has also released the specification documents, and as we go through them we can foresee that, in the future, Intel is going to utilize this for a combined attestation mechanism between the trust domains and the SGX framework.
So probably that is going to be in use for both of them, but moving to that is currently not on the Avalon timeline. Otherwise, one of the things we did not consider in this work is side channels; we do not deal with them, and we assume that the cryptography is perfect. The reason is that the Intel SGX model itself does not cover side channels: it is the responsibility of the enclave developer to ensure that there are no vulnerabilities inside the enclave and that side channels cannot be exploited. But of course they can be considered, and there are various mechanisms and different tools available for that, as well as information-theoretic concepts that can be utilized here to provide at least some abstract guarantees. And finally, as I mentioned, the implementation is not limited to Intel SGX. As a secondary direction, we can provide some guarantees, or an implementation, using other trusted execution environments, including the AMD secure processor. AMD has also recently released its SEV-SNP version, which provides many guarantees including remote attestation, and it would be nice to explore what kind of guarantees or additional features we can get from that AMD secure processor. Before going into a detailed development phase, we can get an idea from the formal model of how useful it can be. And here I list a few of the key references used in this presentation: I mentioned the Intel SGX Explained document, the two tools that Intel mentions, the software developer manuals, and these two papers. All of this documentation is available and you can go through it.
So thank you very much for your attention. You can contribute to Hyperledger Avalon — I have the link here, the slides will be available, and of course you can Google "Hyperledger Avalon". We also have a link here for the project updates: the kind of formal guarantees we are providing, treating the attestation mechanism formally, and ultimately providing a better attestation mechanism to utilize. Of course I will try to answer the questions that I can now; otherwise, if you have any questions in the future, you can contact me by email. So that's all from my side — let's see if you have some questions. I see a question: why are we using the symbolic model instead of the computational model? Okay, right, that's an interesting question. Basically, the reason is that it is always good to start with the basic step — let me go back to the slide for a moment... okay, this one. The symbolic model that we generate can actually be utilized for the computational model, but not the reverse. For the computational model, the first problem is that you need a lot of details which unfortunately have not been made public. So the symbolic model was quite basic and was a kind of starting step that we could go ahead with, and of course, when we have those details, we can move to the computational model to get some further guarantees. Yeah, thank you. I think there are no more questions. "Yeah, thank you, it was a very insightful talk." You're welcome. And so that brings us to the end of part one of our four-week, three-part session. Next week we have interesting topics lined up as part of our event.
Next week, you will get to hear about production-grade deployments and the tools with which you can deploy a production-grade Hyperledger Fabric network, and then how you measure its performance once the network is deployed, and how you visualize what's happening in a blockchain network using the Explorer tool. We will also have the Blockchain Automation Framework, an automated deployment tool which was contributed by Accenture and is now in Hyperledger Labs; they will be speaking about what is covered as part of BAF and how easy it makes deploying a production-grade Fabric network. So, interesting topics lined up for next week. I have put out the registration link for the November 7th event. Please feel free to ask us any questions, and do look out for our LinkedIn notifications and posters. Today's event will be posted on the YouTube channel. Yeah, thank you all — I think we can wrap up now.