Hello and good morning everyone. I hope you can hear me well. And I would also like to wish a good evening to our friends in Asia, where it is a little later right now. Our talk today will discuss Hyperledger Fabric security monitoring based on Hyperledger Explorer. These are two Hyperledger projects that we connected and used to monitor the security of Hyperledger Fabric. A brief overview of the talk we'll be giving today: with me is my colleague Fabian Böhm, who is also a researcher at our department. We will first give a brief motivation for why you should actually care about Hyperledger Fabric security, and then go into some more detail on the background of Hyperledger Fabric and Hyperledger Explorer before detailing our security monitoring architecture, so how we built the security monitoring for Hyperledger Fabric. Then we will go into even more detail on the processing pipeline: what data did we collect, how did we process it, and how did we visualize it at the end. Then we will give a brief live demo of the HyperSec prototype that we built. At the end we will go into a question and answer session, as is customary for all presentations here. So why should you actually care about Hyperledger Fabric security? Fabric is trusted to provide security for a lot of critical applications. For example, there is the TradeLens platform, which I'm sure most of you have heard of, and there are other platforms like METRAID, OpenIDL, etc., which are all using Hyperledger Fabric in production right now. For this reason it is quite important that Hyperledger Fabric is a secure system that can't be hacked easily. Especially since blockchain has a reputation among many organizations that it can't be hacked, which is obviously not the case. You should never take it for granted that blockchain can't be hacked.
Because like any software, Hyperledger Fabric is vulnerable to bugs and exploits, and of course also to denial-of-service attacks. Another important aspect to keep in mind is that the independent operation of the Hyperledger Fabric nodes could potentially also increase the attack surface. For example, if one of the operators has not configured their node properly, then it might be easier for an attacker to get access to that node, even though you have configured your own organization's node correctly. So this is also something to keep in mind with blockchain security, because you have this distributed system that many organizations operate together. I want to give a brief overview of the attacks that are known for Hyperledger Fabric. There is a precondition for this: many of these attacks require that you at least have some kind of access to the network, for example that you can send transactions to the network, or that you can access an API that has access to the network. So you need to have a little bit of access, but not for every attack, of course. For example, a network-based denial-of-service attack will also work if you don't have any credentials for the network. But to abuse a chaincode bug, you of course need to be able to send a transaction, so you need a valid identity in the network and the ability to sign transactions. Regarding chaincode bugs, there was also a great presentation at Hyperledger Global Forum two years ago by ChainSecurity, who went into detail on some possible chaincode bugs. And there are also other attacks that could potentially compromise the state database. For example, by default, if Hyperledger Fabric is using CouchDB as the state database, it has an unsafe default configuration where anybody can access the state database or even modify it. So you should make sure that your system is not configured that way, or you would expose yourself to potentially inaccurate or even manipulated state data in your Hyperledger Fabric instance.
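As an illustration of hardening that CouchDB default, a docker-compose fragment along these lines ensures the state database requires credentials; service name, image tag, and credentials here are only examples, not taken from a specific Fabric sample:

```yaml
# Hypothetical docker-compose fragment: require authentication on the
# CouchDB state database instead of running it wide open.
services:
  couchdb0:
    image: couchdb:3.1
    environment:
      # Without credentials configured, older CouchDB setups ran in
      # "admin party" mode, letting anyone read or modify the database.
      - COUCHDB_USER=admin
      - COUCHDB_PASSWORD=adminpw
    ports:
      # Bind to localhost only, so CouchDB is not reachable from outside.
      - "127.0.0.1:5984:5984"
```

The peer then needs matching credentials, typically via the `CORE_LEDGER_STATE_COUCHDBCONFIG_USERNAME` and `CORE_LEDGER_STATE_COUCHDBCONFIG_PASSWORD` environment variables.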
In addition, there's the problem that we do not have a Byzantine fault-tolerant consensus algorithm, so you basically need to trust your network participants if you're using Kafka or Raft consensus. I think there is one in the works currently at IBM that is supposed to become the future standard for Hyperledger Fabric, but it's not in the documentation right now, for example. I think there are also some custom implementations by researchers, but nothing official yet, unfortunately. And of course there's also the issue of credential theft or compromise, which is always something to keep in mind. If credentials are compromised, confidential data could potentially be exposed, because somebody else could get access to the blockchain data. So how can we support security analysts in detecting these attacks, or in getting a better overview of what's actually happening on the network? The issue, of course, is that each organization only has access to its own nodes and only a limited view of the network, and that you need to monitor both your own hosts and the network data at the same time, to have at least some kind of idea of what's going on on the network. In addition, each node has several very different data sources: we have, for example, the log data, and then you have the metric data and the actual blockchain data, which you need to aggregate in some way in order to get a good understanding of what's going on. And of course there's the large volume of blockchain data, which depends on your deployment and on how many transactions you have on the network; it might be a bit difficult to keep track of what's going on if you have, for example, 100 transactions per second on your network. And then we don't really have any automated systems yet for attack detection on Hyperledger Fabric, unfortunately. So we also need human domain knowledge for security analysis.
So we need someone who knows how the attacks work in order to be able to identify them properly. To be able to assist the human security analysts properly, we designed some visualizations and integrated them with Hyperledger Explorer, so that we can detect the attacks there. Before I go into the architecture, I want to give you a brief background on Hyperledger Fabric and Explorer, just briefly for those who have not had much contact with them; I assume there will not be many here who haven't had at least some contact with Hyperledger Fabric, probably fewer who know Hyperledger Explorer, but I will briefly explain that as well. The Hyperledger Fabric architecture has many different data sources, as I already mentioned. There's the ordering service, and there are the different peers, each maybe running different chaincodes that are used for different transactions. They have configurable logging levels, so you can choose between info, warning, or error log levels; in order to get as much information as possible, you should definitely configure the info log level, for example. On top of that you have the standard SDK APIs, where you can get transaction subscriptions and the log data, and then you also have the metrics that are made available via Prometheus or StatsD, but we're using Prometheus for our project. As for Hyperledger Explorer, it has a basic sync architecture: it syncs all the blockchain data into a relational SQL database, PostgreSQL in this case. It runs a sync service with a block subscription and stores all the blocks and transactions in its own database, also the chaincodes, the different peers, and so on. This is perfect for analysis, because we can analyze all the data on the blockchain quickly and run queries on it, which isn't possible with the standard SDKs.
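For reference, those Prometheus metrics are enabled through the operations section of the node configuration; a minimal sketch, with example addresses, looks roughly like this:

```yaml
# Sketch of the relevant core.yaml fragment for a peer (orderer.yaml is
# analogous). The operations endpoint also serves /metrics for Prometheus.
operations:
  listenAddress: 127.0.0.1:9443
  tls:
    enabled: false   # enable TLS and client auth in production

metrics:
  provider: prometheus   # alternatives: statsd, disabled
```

The log level itself is typically set separately, for example via the environment variable `FABRIC_LOGGING_SPEC=info`.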
So that's basically the architecture. Then you also have the presentation layer on top, which accesses the backend API; the backend runs on Node.js and the frontend is React, and we'll get to that a little later in the presentation. The frontend basically just displays the data and renders some of the visualizations that we added. Now to the security monitoring architecture; this is the architecture that we chose, or that we went with. The top part is mostly unmodified from standard Hyperledger Explorer: to the PostgreSQL database we just added some fields to support the additional data that we were using, the blockchain transaction data is also the same as in the standard implementation, and the basic frontend architecture with React and Redux is also mostly unchanged, but we added some new dependencies for our visualizations, of course, which Fabian will explain in more detail. We also added access to the Docker API and to Prometheus; we used a reverse proxy to forward the requests to Docker and to Prometheus to get the metrics and logs that we need to display in the frontend for our analysts. The basic flow is of course data collection first, then the preprocessing, which happens mostly in the backend, and then the presentation in the frontend, where the data is again processed to some degree so we can display it in the visualizations; but most of the more complicated processing happens on the server side in order not to burden the client with it. Here's an overview of what we actually changed from the original code base; you can also check it out in our GitHub repository, which is linked on the slides. Most of our changes in terms of code were of course in the frontend, because the visualizations take up a lot of code. We also made some backend changes in order to support the additional transaction properties that we want to show, for example the transaction size and the actual identity, which is data that is not available in the standard
Hyperledger Explorer implementation, so we added that. Then we added notifications for config transactions, because if somebody sends a malicious config transaction they could probably destroy the network in the worst case; so you should definitely know about these transactions and potentially approve or deny them if you have the appropriate permissions for that. And for the REST services we added the reverse proxy routes that I mentioned. In total that was about 1,000 lines of code that we had to adjust. For the frontend, we added all of the new files for the visualizations, and we added some new lists to surface the security issues from the Hyperledger Fabric Jira; we always want to know about the open security issues on Hyperledger Fabric, so we can see if somebody is potentially exploiting these security issues on our network. That's something we display on the dashboard right away. In addition, we also made some modifications to some existing views in order to make them a bit more concise and to give a better overview of the network, and we also added, of course, connectors to the new backend API routes, which is something we usually do with Redux in the frontend. So that's it for my part; I will now hand over to Fabian, who is going to give an overview of the processing pipeline and a live demo of our prototype. All right, thanks. I think you should mute now, Benedikt, because we are getting an echo. Hi everyone. Let's take a look at the data collection. As Benedikt just mentioned, our HyperSec tool collects some additional security-relevant data beyond the data collected by the original Hyperledger Explorer project. For example, to get an idea of the current performance of the network with respect to transaction throughput, we look at some additional Prometheus metrics: we take a look at when a transaction is proposed and when it's on the chain, and the duration in between. We also collect gRPC metrics, giving us insight into the network
activity: the network status and the communication between nodes and peers. Besides these metrics and data directly related to Hyperledger Fabric, HyperSec is also able to access the logs from the local Docker containers where the Fabric network is running. We also have access to the security-relevant Hyperledger Jira issues, that is, issues with security tags, and we also have a static chaincode scanner. During our work on getting all this data together, we came across some missing or desirable metrics and functionalities; for example, additional metrics on transactions would be really nice, because that would make it easier to detect transaction spam attacks. When we go through all this data that we have available, we need to pre-process it first. This is especially important when we are working with the transaction data, where we get a possibly large number of data points to display. This is relevant in the transaction view: when we bring all this data together, such a large quantity of data poses a real challenge in the context of interactive visualizations, and thus we need an efficient way to handle it without going into too much detail. With data binning, or data bucketing, we implemented a data aggregation approach for this transaction view. Originally, with Hyperledger Explorer, users specify a time range, a start and an end timestamp, for which they want to see the transactions. So let's take a look at the pre-processing on this slide. Besides this already leading to large data quantities, because we may get a lot of transactions for this time range, we include the additional metrics in the existing view, so this leads to even more data. So we looked at what we can do to aggregate the data here. In our adapted transaction view, users can also specify their preferred step size for the data buckets. With smaller step sizes you can get more detail out of the data buckets, or out of the visualizations, but you also get more data
buckets, and with more buckets the visualizations might respond more slowly. Based on this we build our empty data buckets in the form of a hash map, where each data bucket has an ID and a starting timestamp indicating where in the transactions' time range the bucket lies. The timestamp of the first bucket is just the start the user entered, and the ID of a bucket is calculated by dividing its timestamp by the step size and rounding down to the nearest integer. The timestamp of each following bucket is always just the previous bucket's timestamp plus the user-specified step size. This hash map now allows very efficient population, because we can exactly calculate the bucket that each transaction or metric needs to go into: we take the timestamp of the event minus the user-entered start (this is a kind of normalization we need to do), divide it by the step size, and round down to the nearest integer. This gives us the ID, or the hash, of the respective bucket that this transaction or metric data point belongs to. So we can access the data very efficiently: we do not need to iterate when we need the data for a specific point in time, we can just calculate the hash of the bucket it belongs to and get the data with a runtime of O(1). Okay, that's it for that slide; Benedikt, the next one please. Thanks. We are using well-established libraries in our project to visualize this data. As Benedikt mentioned, Explorer's UI is written with React, so we stayed with React, of course, and a very well-established library for implementing interactive visualizations is D3, Data-Driven Documents. Well, the resulting D3 code is often just huge spaghetti, and another problem with D3 is that it doesn't integrate well with React. But in 2020 a pretty cool low-level visualization toolkit based on D3 was introduced that is very well integrated and combined with the functionality of React, and if necessary all the underlying D3 functionalities are still accessible. So
we mainly worked on the network view of Hyperledger Explorer and on the transaction view. The main changes to the network view you can see on the bottom left side of the slide: we extended it with a node-link graph. The graph shows the local nodes and also external nodes, peers beyond our own organization. There are filled nodes, which mark the nodes of our local organization, and we can also highlight high-frequency links based on the gRPC metrics: if the gRPC metrics are unusually high-frequency or low-frequency, we can highlight this. Our second main focus was the transaction view, on the upper right of this slide, where we can explore information about the transactions on the chains. We added four different visualizations: we have a first chart just displaying the number of transactions within each data bucket that I explained earlier, and then we have three more granular charts. On the left we have the transaction count per MSP per bucket, in the middle we have the average transaction size per MSP per bucket, and on the right we have the average processing times of transactions. At the bottom is a part of the original Hyperledger Explorer table with additional detailed info about the transactions. To see how this all works together and how it looks interactively, we'll just take a look at a quick video, a quick live demo. We start with a little overview at the main landing page of the Explorer, the dashboard, where we get some high-level data about our small test network, consisting of only two peers. Besides all this other data, we have the Hyperledger Fabric Jira issues, including some details, the descriptions, and also a direct link pointing to the issue. Next we take a look at the network view, which gives us more information about the active peers in the network. We have this graph with the details we know about the current network, meaning the nodes in our organization, which is one peer and one orderer, marked by these two
different colors. So we have these two nodes of our own organization, and we have one external node from another MSP. We get some basic information about the nodes on hover interaction, as well as some basic metrics about the communication between our nodes when hovering on the links. And we have this table below the graph, where we have direct access to the Docker container logs of our local organization's nodes; that is what you can see here, just the Docker container logs. Next we look at the HyperSec transaction view. This view has three main parts: the search parameters at the top, the interactive charts to explore the available information about the transactions, and then a table with details, where each row is a transaction. We can start by selecting a start date from which point on we want to explore the data. In the topmost chart we can now see two buckets which seem to have a bit more activity than the rest of the buckets. We can see the time window in which these buckets lie and change our search parameters to this time window accordingly; the time window is now only around three hours small, and with the smaller time window we can also change to a more fine-granular aggregation. So we change the bucket size to one-minute buckets, and then we see a pretty regular pattern, but with two spikes breaking the pattern. We have one MSP which published a bit more transactions here; we can filter the MSPs here in the view. What is also showcased and highlighted up here is the endorser proposal duration metric: the transaction duration increases significantly, to more than 300 milliseconds, with the submission of these transactions. The table at the bottom now shows the transactions related to the currently selected time window and also gives some more details, including a direct link to each transaction. So that's it for the quick video; I guess that's it about our HyperSec demo. We can go to the last slide now. The tool, or the current
research prototype, is available on GitHub; check it out there. We are currently experimenting with some different views and different data processing approaches. And I guess now it's just up to me to say thanks, everybody, for your attention; we are now happy to answer your questions. There's already one question in the Q&A: Norak Roy asks how the Docker container is connected, and how the integration will differ if your peers are running on Kubernetes. Maybe that's something you could answer, Benedikt. So basically we are accessing only the local Docker container. We have actually not tried it with Kubernetes; that is something you would have to experiment with on your own. But we are using a standard Node package for accessing Docker, so you probably just have to change the URI and it should work, hopefully. The most important part is that you actually have access to the Docker API; then you can just access it by putting the proper URI into the code, and the backend is going to forward the request to Docker, and you get the logs in the frontend again. And of course you are only going to be able to view your own nodes' logs, and not the ones of the other organizations, because you might not have permission for that. Okay, so if there are no other questions, I could also go into a bit more detail on some potential for improvement that we noticed in the Hyperledger ecosystem, shown on slide 13. There are some missing metrics that could potentially be added, for example on outstanding transactions, discarded blocks, or failed leader elections, which is very interesting if you are trying to detect an attack. Also regarding vulnerabilities: there is of course a need for proper threat intelligence by version, and there is no really good source for that currently. You would also want chaincode scanners for chaincode in other languages; we used one for Go, but there are no proper chaincode scanners for example for
JavaScript or similar. So those are development needs that the community could potentially fill. I see we have some more questions. Tayamul is asking: are you querying Fabric for data, or collecting it in real time? The way Hyperledger Explorer works, it has a real-time sync connection to Hyperledger Fabric, so each block arrives at Hyperledger Explorer as it is published on the network. So you have an almost real-time, or near real-time I would say, connection to the network. For example, if a config transaction is published, you should, very shortly after that, pending some processing delay, get a config notification in your user interface. So we are not doing on-demand queries; we are instead syncing with the network live and then querying our sync database. Besides the sync delay there is not much delay, so it would be perhaps some seconds, but not more than that. Another question, by Soma Roy: he is asking about the channels. We actually didn't show the channel view in the prototype and the demo, but I can quickly go back to the slide; you can see that at the top there is a channels tab in Hyperledger Explorer. So you can actually see the different channels there, and you can also select the channel on which you want to view the transactions. Hyperledger Explorer can store blocks and transactions for all channels that you have access to with your identity, so it can store them for multiple channels; of course you need to have access, so if the channel is encrypted for you, then Hyperledger Explorer can't really analyze much. That's how it works in practice. Okay, so I think we have reached the end of our time. If you have any more questions, feel free to contact us; you can reach us of course on GitHub or via our email addresses. If you just search for our names and the University of Regensburg, you should easily find them on Google, for example. And we would be very happy if somebody wants to, for example, contribute to HyperSec on GitHub, and we are
always very open and welcoming to contributions there. We of course always try to contribute our own developments back to the original project; we have made some pull requests to Hyperledger Explorer to improve the project, and we are of course going to continue to do so as we make improvements in our prototype that are relevant for the main project. We are very happy to do that, of course. And yeah, that's it for our presentation; feel free to contact us, and have a nice experience at the conference.
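The constant-time data-bucketing scheme described in the processing pipeline part of the talk can be sketched as a minimal JavaScript illustration; all function and field names here are our own invention, not the actual HyperSec code:

```javascript
// Sketch of the bucketing approach: bucket IDs are derived by integer
// division of normalized timestamps by the step size, so population and
// lookup are O(1) per event. Timestamps are in milliseconds.
function buildBuckets(startTs, endTs, stepMs) {
  const buckets = new Map();
  for (let ts = startTs; ts < endTs; ts += stepMs) {
    // Normalized ID: how many steps this bucket lies after the start.
    const id = Math.floor((ts - startTs) / stepMs);
    buckets.set(id, { id, startTs: ts, txCount: 0, totalSize: 0 });
  }
  return buckets;
}

// Route an event (transaction or metric sample) to its bucket in O(1),
// without iterating over the bucket map.
function addEvent(buckets, startTs, stepMs, event) {
  const id = Math.floor((event.timestamp - startTs) / stepMs);
  const bucket = buckets.get(id);
  if (!bucket) return; // event falls outside the selected time range
  bucket.txCount += 1;
  bucket.totalSize += event.size;
}
```

For example, with a start of 0, an end of ten minutes, and one-minute buckets, a transaction with timestamp 90,000 ms lands in bucket 1. Both the build step and the population step use the same start-normalized formula, which is the kind of normalization mentioned in the talk.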