Welcome everyone. So today I'm going to talk about how to audit the enterprise blockchain efficiently. I'm Bohai Yang from Oracle, and I'm the principal architect of the Oracle blockchain team. Today I will cover several topics. The first one is why we need to do auditing when you're using blockchain technologies in an enterprise environment. Then, when you want to do blockchain auditing, why it's challenging and what problems you need to resolve. Next, if you want to design or implement an auditing framework, how to do it efficiently. After that I will show the design of the auditing framework and the components in the system. And finally I will show the evaluation results to demonstrate the performance we can achieve.

Okay, so why do we need auditing for the enterprise blockchain? I think there are four major reasons. The first one is for the platform users and the operators: they need to know what has happened with the platform, what operations have been taken, and what transactions have been received by the platform. The second major reason is that sometimes you want to improve the performance of your blockchain system, and you need to figure out where the bottleneck is. The third one is related to security: when you find something is not normal in the system, you need to figure out what happened. And the last one is related to policy requirements. Basically, there might be some compliance requirements or other requirements that need you to do the auditing.

Okay, so when we do auditing for a blockchain system, what exactly needs to be audited? There might be many resources that you want to audit, but basically we can categorize them into three dimensions. The first one is who is using the blockchain resources. For example, who is using the peer, who is using the orderer, taking Fabric as an example. The second one is the transaction level: what transactions have been received by the platform on a specific date, or how many transactions there have been. And the last one is that you may want to know whether everything is okay with the system, whether there is any data or activity that is not expected. All these things need to be audited if you want to do auditing for the enterprise blockchain.

Actually, auditing is not a new feature for information systems. Many other technologies are already using auditing features, including websites and enterprise information systems. Taking websites for example, they typically use auditing widely for two major purposes. The first one is that they want to trace performance issues; they especially want to find the performance bottleneck, how soon the page can be loaded and how soon a link can be jumped to. The second one is that they want to achieve SEO: they want to improve their search ranking on the internet. For website auditing, the most common way to implement it is to insert some tracing code into every page. Then, when a user visits the page, the tracing code will record the information in the backend. The second way is that you can do it offline: you analyze the server log files, especially for the load-balancing server or the web server, and you can see from what IP a specific page has been visited.
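To make that offline log-analysis idea concrete, here is a minimal Go sketch that counts visits per client IP from a web server access log. The file name access.log and the assumption that the client IP is the first whitespace-separated field are illustrative only, not tied to any particular web server.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// Count how many times each client IP appears in a simplified access log,
// assuming the IP is the first field on every line (a common log layout).
func main() {
	f, err := os.Open("access.log") // assumed log file name
	if err != nil {
		fmt.Println("open log:", err)
		return
	}
	defer f.Close()

	visits := map[string]int{}
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) > 0 {
			visits[fields[0]]++
		}
	}
	for ip, n := range visits {
		fmt.Printf("%s visited %d times\n", ip, n)
	}
}
```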
So here is an example of website auditing results. From the page you can see it records, over the last several days, how many sessions happened, how many users and new users there were, how soon users land on specific pages, and also from which page users exit, including the landing performance, the page depth, and also the bounce rate. With all these metrics you can easily understand what happened with your websites and how to improve them.

So if you want to adopt these mechanisms for blockchain auditing, you will find it quite challenging. Basically, the blockchain system is different from the website architecture. The blockchain itself is always distributed and multi-party, so it is not practical to add auditing code into every component. And what is worse, we all know that adding tracing into the system will slow down the performance, which is not what we want. Also, when an enterprise blockchain is used in a production environment, we typically set the logging level to at least the warning level, so if you try to analyze the log files to find the auditing information, you will find very little. All these factors make blockchain auditing challenging.

So with all these challenges, how do we overcome them and implement efficient blockchain auditing? We need some observations. The first one is that in a blockchain, the members of the network typically share the same ledger data. Taking Hyperledger Fabric for example, there is a channel, and every member in the channel will have exactly the same ledger data; they can see all the activity inside the channel. All the transactions in the channel will be recorded into the ledger, and the ledger is stored on the blockchain peer locally. We also know that in order to generate the blockchain state we only need the ledger data, so the ledger is the fundamental source for generating all the auditing data.

With these observations, we can see that if you want to analyze the ledger data directly, it is difficult for two reasons. The first one is that the ledger size is usually large; we have seen the ledger size in a single channel go over hundreds of gigabytes. And what is worse, with more and more incoming transactions the ledger size is always growing. So if we want to do blockchain auditing based on this ledger data, we need to design an efficient algorithm and also an efficient system to make it high performance.

So this is the basic overview of the proposed auditing system. You can see we have the client, the API handler, the ledger verifier, the DB, and the ledger. The client and the ledger are outside the auditing framework, so we mainly look at the API handler, the ledger verifier, and the DB here.

The first component is the API handler. The API handler provides the auditing RESTful APIs to the clients. Based on our experience, we found that most users would like to use RESTful APIs to get the auditing information. When the client sends a RESTful API request to the API handler, the API handler has two options for the response. The first one is that the API handler gets the data directly from the results stored in the DB. This is the recommended way, because it allows a very quick response and it also decouples the API handler from the ledger verifier. The second way is that, if there is no such DB storage, you can let the API handler directly call the ledger verifier, and the ledger verifier will get the auditing result on the fly. In this case, if the ledger volume is large, the response might be slow.
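To illustrate these two response options, here is a minimal sketch in Go. The /audit endpoint, the AuditReport shape, and the auditStore and verifyOnTheFly names are hypothetical stand-ins for the DB lookup and the ledger verifier call, not the actual implementation.

```go
package main

import (
	"encoding/json"
	"net/http"
)

// AuditReport is an illustrative shape for the auditing result of one channel.
type AuditReport struct {
	Channel     string `json:"channel"`
	BlockHeight uint64 `json:"blockHeight"`
	TxCount     uint64 `json:"txCount"`
	Valid       bool   `json:"valid"`
}

// auditStore stands in for the DB where the ledger verifier stores its results.
var auditStore = map[string]AuditReport{}

// verifyOnTheFly stands in for calling the ledger verifier directly; on a large
// ledger this path can be slow, which is why the DB lookup is the recommended one.
func verifyOnTheFly(channel string) AuditReport {
	// A real implementation would scan the local ledger files here.
	return AuditReport{Channel: channel, Valid: true}
}

// handleChannelAudit serves GET /audit?channel=<name>.
func handleChannelAudit(w http.ResponseWriter, r *http.Request) {
	channel := r.URL.Query().Get("channel")

	// Option 1 (recommended): answer from results the verifier already stored in the DB.
	if report, ok := auditStore[channel]; ok {
		json.NewEncoder(w).Encode(report)
		return
	}
	// Option 2 (fallback): ask the ledger verifier to compute the result on the fly.
	json.NewEncoder(w).Encode(verifyOnTheFly(channel))
}

func main() {
	http.HandleFunc("/audit", handleChannelAudit)
	http.ListenAndServe(":8080", nil)
}
```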
So here are some examples of the auditing APIs. You can see that with the first API we return the auditing information for a specific channel, and with the following ones we can return the health status of the nodes, the resource usage, the invocations, the number of transactions, and also the statistics in terms of blocks and more. With all this information returned from the auditing APIs, you will know everything about the blockchain service.

The second major component is the ledger verifier. This is the core component of the entire system, because it directly processes the local ledger to check whether the data has integrity according to specific criteria. The criteria can be very flexible. For example, we can simply check whether the blocks in the ledger file are connected correctly, we can check only the data hash, or we can also check that the hashes are chained, that the previous hash matches the previous block, and we can check more data here. In order to provide high processing speed, we have two processing modes. The first one is full processing: we process the ledger from the very beginning, which may take a long time if the ledger size is big. The second one is what we call incremental processing: we do not process the ledger from the very beginning, but from the position we processed up to last time, and we record the processed position in the external database. This incremental processing is very friendly for high-performance requirements.

The third component is what we call the DB. It stores the results produced by the ledger verifier, and you can also store other necessary metadata information there; for example, this lets you run the framework in a distributed or clustered deployment. Someone may have a question about how we can protect the DB data and avoid malicious modifications. In this case we always recommend that you deploy the DB in a safe environment. You can use a self-verifying DB like a blockchain table, and you can even host the DB on another blockchain.

So here is an example of the results. This result shows a valid blockchain ledger. You can see the timestamp, how long the processing took, the ledger file, the ledger height, the block numbers, the transaction number, the config block index, and also the hash values. This next example shows the returned message when the ledger is invalid. You can see that using the tool we have identified a previous-hash mismatch at block 2: there is an expected value, but the previous-hash value stored in block 2 does not match it. We also return the failing ledger file. With all this information, operators can check what has happened. In order to give more details for blockchain auditing, we also provide statistics within a single block: basically the hash value, the previous hash value, the data hash, the number of transactions, and also the size.
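As a rough illustration of the hash-chain criteria and the incremental mode described above, here is a small Go sketch. The Block struct, the headerHash computation, and the verifyFrom helper are simplified assumptions for this talk, not the real Fabric block format or the actual tool.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// Block is a simplified stand-in for a ledger block; in Fabric these fields live
// in the block header and block data, but the chaining idea is the same.
type Block struct {
	Number       uint64
	PreviousHash []byte // expected to equal the hash of the previous block's header
	DataHash     []byte // hash over this block's transaction data
	Data         []byte
}

// sum returns the SHA-256 digest of data as a byte slice.
func sum(data []byte) []byte {
	d := sha256.Sum256(data)
	return d[:]
}

// headerHash computes the value the next block should carry as its previous hash.
func headerHash(b Block) []byte {
	h := sha256.New()
	fmt.Fprintf(h, "%d", b.Number)
	h.Write(b.PreviousHash)
	h.Write(b.DataHash)
	return h.Sum(nil)
}

// verifyFrom checks the chain starting from the last processed position
// (incremental mode); pass start = 0 and prevHash = nil for a full scan.
func verifyFrom(blocks []Block, start int, prevHash []byte) error {
	for _, b := range blocks[start:] {
		// Criterion 1: the data hash must match the block data.
		if !bytes.Equal(sum(b.Data), b.DataHash) {
			return fmt.Errorf("data hash mismatch at block %d", b.Number)
		}
		// Criterion 2: the previous-hash field must chain to the prior block.
		if prevHash != nil && !bytes.Equal(b.PreviousHash, prevHash) {
			return fmt.Errorf("previous hash mismatch at block %d", b.Number)
		}
		prevHash = headerHash(b)
	}
	// In incremental mode the new position and prevHash would be persisted to the DB here.
	return nil
}

func main() {
	genesis := Block{Number: 0, Data: []byte("genesis")}
	genesis.DataHash = sum(genesis.Data)
	next := Block{Number: 1, Data: []byte("tx batch"), PreviousHash: headerHash(genesis)}
	next.DataHash = sum(next.Data)

	if err := verifyFrom([]Block{genesis, next}, 0, nil); err != nil {
		fmt.Println("invalid ledger:", err)
		return
	}
	fmt.Println("ledger is valid")
}
```

In an incremental run, the start position and the last verified hash would be loaded from the DB checkpoint before the call and persisted again afterwards, which is what keeps the processing cost independent of the total ledger size.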
So we have done several performance evaluations. The first one tests the system throughput in transactions per second. You can see there are two curves on the chart. The top one is when we only do the analysis and do not do the verification, and the one below is when we do both the analysis and the verification. The x-axis is the number of CPU cores. You can see that with one CPU core we can achieve around 50,000 TPS with the verification involved, and without the verification we can achieve around 150,000 TPS. I also want to mention that this framework can take advantage of parallelism. With more CPU cores, the speed will definitely increase. You may notice that the trend of the curve is not linear after three CPU cores, because with more than three cores in our test environment the IO becomes the bottleneck.

The second performance evaluation is related to the allocated memory. Here we test with different numbers of blocks, from 1,000 to 128,000. You can see the allocated memory is always stable, around 60 megabytes, which means the auditing framework is very stable in its memory usage. We also collect the total cumulative allocations, that is, all the memory that has been allocated during the processing. You can see it is nearly linear with the number of blocks, which also means the memory usage of the framework is very stable. And this chart shows the number of GC runs versus the number of blocks. You can see the curve here is also linear, which means the latency behavior of the framework will be stable.

Okay, so that's all of the presentation. Thank you. And let's see whether there are questions. Let me see the Q&A. The first question is from Rafael Chiu, about using chaincode to automatically execute the audit: "We have tried this. Do you think that this methodology is viable?" He gives a link. Let me open the link. Oh, it's a paper titled "Toward Secure Decentralized and Automatic Audits with Blockchain." Sorry, I haven't read this paper before, so I may need some time to read it before I can answer this question.

The second question is: "It looks like the overall goal of the tool is to capture the ledger data so that analysts can make sense of it. But could we analyze this data in real time, in your opinion? I believe we could." Yes, yes. Actually, our tool does this in real time, but for a very large ledger size we need to do some optimization, for example the incremental processing here. In this case, we can always return the response to the API request in real time.