All right, you're live. Vipin, if you want to take it away. Yeah. Hi, everyone. As our guest speaker today we have Moshe, and the talk is all about transforming the edge to a virtual cloud using Hyperledger Fabric. Over to you, Moshe. So hi, everyone. Thanks for having me here for the meeting, great to be with you. I'm sharing my email here, so if somebody is interested in more information or wants to try the approach that I'll be presenting, feel free to reach out to me afterwards. What AnyLog is doing is transforming the edge to a virtual cloud. We know that the edge is a very complicated environment, and I'll talk a little bit about that. What AnyLog does, and I'll try to be very detailed to show you the details of the story, is make the edge an environment which is very simple and easy to interact with. At a high level, that means all the distributed data becomes accessible and available from a single point. Companies or users can interact with their edge data as if it were centralized, although it remains in place, distributed at the edge. And at the same time, the same users can interact with all the edge resources, servers, Raspberry Pis, switches, gateways, and sensors, as if all those distributed resources were a single machine, which makes it very much like a cloud. You log into some web page, and from there you can manage all your data and all your resources. The same thing is achievable using the approach that I'll be showing and explaining. So let's talk first about the problem and the challenges at the edge. Maybe the number one problem is the huge amounts of data, and people who are familiar with data know that even simple problems become very complicated when you have a lot of data. Next to it is the issue that the data is not in one place; the data is distributed all over the place.
And again, we know that there is no efficient technology that allows users to extract insight from data which is distributed, so what companies are forced to do in this scenario is centralize the data. If we continue around this circle, the next thing to mention is that the edge resources are very limited. It's not the same servers and the same setup which you've got in the cloud; you've got different Raspberry Pis and switches which are not uniform, and connectivity is sometimes lost. So edge resources are lacking, or lagging behind, what the cloud is able to offer. Then there is a very common problem: you put some process on an edge node, and this process can see local data, but in many cases, to reach the right decision or do the processing right, you need to see data which is not locally on your node. So what do you do then? This becomes complicated. Now you have to transfer data between nodes and write some complex protocols, which is not something simple and easy to do. On top of it, one of the motivations to push processing to the edge is real time, being able to react quickly when an event happens, and in this setup at the edge, where you don't have the resources the way you're used to in the cloud and the data is distributed, that becomes very complicated. I could go on and on with many, many issues and problems, but maybe one thing which is important to notice is that the edge doesn't have data services similar to what the cloud is able to offer. At the edge, you're on your own. Not only do you have to build the software, but before that you have to figure out what you're doing. In many cases you have to stitch many solutions together, and then even if you build something, now you have to deploy and manage it on many, many nodes, which could be thousands of nodes, and that creates a very complicated setup. For those reasons and many others, companies try to minimize their work at the edge and centralize the data.
And for the things that they've got to resolve at the edge, if you have to shut some valve, you cannot bring the data to the cloud first; you have to figure out how to do it at the edge, and in this scenario that becomes a very complicated process. So that's the state of the edge, and what I'll show you is a way, using the blockchain, using Hyperledger Fabric, to overcome all those issues. But before I talk about the solution, let me just paint what companies are doing today, what the typical architecture looks like. You've got edge devices or applications that generate data at the edge. The first thing companies do is collect the data: you have some nodes at the edge that host the data, and here you have to inject a lot of domain knowledge. You have to understand what data is coming in, what the sensors are, what the devices are that send the data. You look at the data and decide: this data I need, this data I'll drop; this is too much data, so maybe I'll just take summaries or part of the data. So you build those solutions, and from there the data goes upstream through more layers, which I'm skipping, but eventually the data reaches the cloud, where you host it in, let's call them, domain databases, databases that are built to satisfy the structures and create the environment where you can connect your applications, and from there you serve the data to the applications. Obviously this is a complicated setup; it takes months to build those things. By the way, over 70% of edge projects fail, and this is just part of the story, right? On top of it, like we said, there are things you've got to do at the edge, so now you have to push applications to the edge, which makes the story more complicated. And again, we could go on and on about this architecture, but there is no other way; that's what companies are building, and obviously this is project-based.
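To make the conventional pipeline concrete, here is a caricature of the hand-built edge collector just described: every decision about what to keep and what to summarize is domain knowledge hard-coded per project. This is purely illustrative; the sensor names and rules are made up, not taken from any real deployment.

```python
# A caricature of the conventional edge collector described above: every
# decision (which readings to keep, what to summarize) is hand-written
# domain knowledge that has to be built and maintained per project.
# Sensor names and rules are hypothetical.

KEEP = {"temperature", "vibration"}      # "this data I need, this data I'll drop"

def collect(readings):
    """Filter and summarize readings before shipping upstream to the cloud."""
    kept = [r for r in readings if r["sensor"] in KEEP]
    # "too much data, so maybe I'll just take summaries"
    summary = {}
    for r in kept:
        s = summary.setdefault(r["sensor"], {"count": 0, "total": 0.0})
        s["count"] += 1
        s["total"] += r["value"]
    return {k: {"avg": v["total"] / v["count"], "count": v["count"]}
            for k, v in summary.items()}

readings = [
    {"sensor": "temperature", "value": 21.0},
    {"sensor": "temperature", "value": 23.0},
    {"sensor": "humidity", "value": 40.0},    # dropped: not in KEEP
]
print(collect(readings))   # {'temperature': {'avg': 22.0, 'count': 2}}
```

Each new device type or schema change means revisiting code like this on every collector, which is part of why these projects take months.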
It's hard to build, manage, and scale those setups and architectures. You end up with multiple software stacks in the different tiers, data is siloed, and obviously there is a big impact on cost. But that's the way companies deal with edge data today, so let me show you a different approach. The first thing I'll talk about, in the context of Hyperledger Fabric, is that we use it as a way to deal with the metadata. I'll talk about the process, and then you'll see how it fits into the bigger picture. The way it works: we've got Hyperledger, and we've got nodes that host data, and we'll talk about those nodes. The nodes host the data; Hyperledger hosts the metadata, the information about the data. We call the information that we write to the blockchain policies. And if you were to ask me what those policies are, the answer is that a policy can be anything that is important and that you care about. AnyLog as a platform uses some types of policies, but this can really be extended to anything that not only AnyLog cares about but that the other applications running at the edge would care about. You'll see how this comes into play, but some examples of policies: data distribution is a policy, where the data is; information about members, the nodes that participate in this game of hosting data; security policies; configurations. All of those things can live as metadata hosted on the blockchain, which is separate from the data itself. So again, the data lives on those nodes, the blue circles to the left of the blockchain, and the metadata is on the blockchain.
And in that context, users, applications, and nodes create policies. To add a policy to the metadata, all you need is to create the policy, which is a JSON structure that describes what the policy is about, and then hand the policy to one of the nodes in the network; it doesn't matter which one. That node adds the information to the blockchain. So from the application's point of view the process is very simple: it's just a REST request where the policy is delivered to a node, and the node calls the Hyperledger API to put it on the blockchain. What's interesting is that when something is added to the blockchain, all the other nodes that participate have a process which continuously synchronizes their metadata, their knowledge, with what's being added to the blockchain. And by the way, they can also list the things they care about, so they don't need to pull everything; but everything they care about becomes available to all the nodes that participate in this process of hosting the data. At the same time, if a node or a user application needs to look up information which is part of the metadata, it can just reach out to any node, because in this model all the nodes are synchronized, ask for whatever it wants, and get it quickly. What we will focus on in this talk is data distribution policies: when a node hosts data, it registers on the blockchain a policy that says, I'm hosting this type of data. How this comes into play you'll see in the next slides, but what's important to remember now is that we've got a shared metadata layer which makes all the metadata available to all the nodes, and if a node or a user adds something to the metadata, it becomes available to all the nodes that participate. We'll see how this functionality is leveraged in managing the data at the edge.
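The publish-and-sync flow just described can be sketched in a few lines. This is a minimal in-memory stand-in, not the AnyLog or Hyperledger Fabric API: the class names, the policy fields, and the single `sync()` call (which collapses the continuous synchronization process) are all hypothetical.

```python
import json

# Minimal sketch of the shared-metadata idea: a ledger that every node
# synchronizes against. Class and field names are hypothetical, not the
# actual AnyLog or Hyperledger Fabric API.

class Ledger:
    """Stands in for the blockchain: an append-only list of policies."""
    def __init__(self):
        self.policies = []

    def append(self, policy):
        self.policies.append(policy)

class EdgeNode:
    def __init__(self, name, ledger):
        self.name = name
        self.ledger = ledger
        self.metadata = []          # local, continuously synced copy

    def add_policy(self, policy):
        """An application hands a policy (a JSON structure) to any node;
        the node puts it on the ledger."""
        self.ledger.append(json.loads(json.dumps(policy)))   # store a copy

    def sync(self):
        """The continuous synchronization process, collapsed to one call."""
        self.metadata = list(self.ledger.policies)

ledger = Ledger()
node_a, node_b = EdgeNode("node-a", ledger), EdgeNode("node-b", ledger)

# A data-distribution policy: "I'm hosting this type of data."
policy = {"distribution": {"table": "charging_stations",
                           "host": "node-a", "region": "san-francisco"}}
node_a.add_policy(policy)

node_b.sync()   # node-b never saw the request, yet the policy reaches it
print(node_b.metadata[0]["distribution"]["host"])   # -> node-a
```

The point of the sketch is the asymmetry: writes go through any one node, but after synchronization every node can answer metadata lookups locally.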
So from here, let's set the blockchain aside for a minute; we'll get back to how it's being used. Let's switch gears and think about the data for a few minutes, and then we'll see how all those things come together. So we've got things that generate data. It could be a smart home with a lot of sensors, lots of things that generate data. What we do here is take all the data and connect it to a node that hosts AnyLog, okay? Data from whatever is generating it, could be the refrigerator, could be a thermostat, goes to a node that sits at the home. This node gets the data, and there is a part of the technology here which I will not cover in this presentation, but if somebody is interested, again, I'll be happy to share more information offline. What's important to understand is that the node knows, in an automated way, how to manage the data. You don't need to send an engineer there. All you have to do is take the data from your edge devices and bring it to a node at the edge. The node looks at the data, knows how to host it locally, and stores it, all with full automation. But then there are other use cases, right? Think about a car generating a lot of data from a lot of sensors. Again, the same story: we could host the AnyLog node in the trunk of the car, and this node hosts the data locally. And again, what's important to remember, this is completely automated: no engineers, no domain knowledge. If you remember, one of the previous slides was all about domain knowledge that needs to be brought to the edge nodes. Here you just place a node, connect the devices, and push the data. And then there are so many use cases, right?
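One way to picture "no domain knowledge" ingestion is a node that derives a table schema from whatever JSON readings arrive, instead of an engineer defining it up front. The sketch below illustrates that idea only; it is not how AnyLog actually maps incoming data, and all names are hypothetical.

```python
# Sketch of "no domain knowledge" ingestion: the node infers a table
# schema from the first reading it sees, rather than an engineer
# defining it per device. Illustrative only, not AnyLog internals.

def infer_schema(reading: dict) -> dict:
    """Map each JSON field to a column type based on the value seen."""
    type_map = {bool: "bool", int: "int", float: "float", str: "varchar"}
    return {key: type_map.get(type(value), "varchar")
            for key, value in reading.items()}

class IngestNode:
    def __init__(self):
        self.tables = {}   # table name -> {"schema": ..., "rows": [...]}

    def ingest(self, table: str, reading: dict):
        if table not in self.tables:        # first reading defines the table
            self.tables[table] = {"schema": infer_schema(reading), "rows": []}
        self.tables[table]["rows"].append(reading)

node = IngestNode()
node.ingest("thermostat", {"ts": "2024-01-01T10:00:00", "temp": 21.5, "on": True})
node.ingest("thermostat", {"ts": "2024-01-01T10:01:00", "temp": 21.7, "on": True})

print(node.tables["thermostat"]["schema"])
# -> {'ts': 'varchar', 'temp': 'float', 'on': 'bool'}
```

Contrast this with the hand-built collector from the conventional architecture: here the structure follows the data, so adding a new device type requires no engineering visit.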
So with all those use cases, you just bring the data to nodes, and by the way, we don't say where those nodes need to live. It's really up to the user: it could be on the production floor, at a local data center, or on the device itself, whatever the use case is. And the claim now, and I'll show you how everything comes together, is interesting: there is nothing else that needs to be done. From here on, the data is available to the applications. Maybe from the previous slide you could start to figure it out, but if somebody needs the data, let's say I'm driving a car near San Francisco and I want to find charging stations in San Francisco that have a line shorter than five minutes of wait, I can issue the query and get the result from the edge nodes rather than from a centralized database. In this model there is a place for a centralized database, and I'll talk about it, but in the context we're discussing now, you get the data from the edge, in real time, without intermediaries, very efficiently. In the same way, say a user or a process wants to deal with energy: it issues a query and sees exactly what it would get if we first transferred the data to some centralized database, but here it comes from the edge, and again, everything is done without engineers. What's happening under the hood is that when a query is issued, we identify the location of the relevant data at the edge. We could have millions of nodes in this model; when a query is issued, we identify where the relevant data resides, and I'll show you more details of that. Then it's all a peer-to-peer play: we connect the consuming application with the nodes that host the data. It's all peer-to-peer, all done in an automated way, in real time, and in a very secure way. I don't have time to talk about the security, but this is a more secure model than the cloud, which is also an interesting topic to discuss.

From here, let me show you what's going on internally, give you a little more detail, and then I'll show you a quick demo so we get a feel for how it looks. The first thing is, take the nodes and deploy them wherever data is being generated; again, we don't care where the nodes live, but each one of those nodes is connected to devices, PLCs, sensors, whatever the use case is. Then there is another layer, which we mentioned before, the blockchain layer that hosts the metadata, and between those nodes and the shared metadata layer is the continuous synchronization process, where policies are added and immediately become available to all the nodes in the network. So although those nodes are distributed, they're all working against the shared metadata layer, which makes them operate like one unified machine which is fully synchronized. The next thing to say is that the data generated by edge devices is pushed into those nodes at the edge, and like we said before, that's the only thing that needs to be done; from here on, the data is available. The way it works: an application that needs data hands its query to one of the nodes in the network, and it doesn't matter which one; this node may not have any relevant data itself. The node looks at the query on one hand, and it has the shared metadata layer on the other, and if you remember, we circled the distribution policies, so it understands where the relevant data resides. It can do a lookup against the blockchain: which are the nodes that host the data of table X; or, in the case of machinery, the machines owned by a particular company, or machines of version Y, or nodes in a region, whatever. If we go back to the example of the car driving to San Francisco, the car may say, I want to find charging stations in San Francisco, so the query process identifies the relevant nodes. Again, we could have millions of nodes in our network, but the query process identified the relevant nodes that host the relevant data, and from here it becomes a peer-to-peer play, like MapReduce: the query is sent to those edge nodes, each of those nodes processes the query locally, and they all return a result to the node that the application is connected to, where all the results are aggregated and returned as a unified result to the user or application. That is the basic process, and what's interesting is that it's very generic: the use case doesn't matter. Take connected cars, smart cities, industrial, robotics, it's always the same. Issue a query, the query process identifies the nodes that host the relevant data, those nodes process the data locally and return a unified result to the application. So the application sends a query and sees exactly what it would see if it worked against a centralized database. I could also argue, and I won't go into it here, but if somebody's interested we could take it offline or you can reach out to me, that this process is more efficient than running against a centralized database. This is really what I'm talking about, real time or near real time, and obviously the data remains at the edge. What's also important to say is that the cloud, or centralized databases, always have a role, in the sense that there is some data that has to be in the cloud for reasons other than, or in addition to, what we've discussed here. The cloud is like a client to this thing, like an application: it can query the data, it can issue a repeatable query, it can be driven by events; AnyLog has an integration into Kafka, for example, and you can put a rule engine on it and continuously transfer a piece of the data. But the core thing is that you can keep at least some of the data, and in some cases all the data, depending on the use case, at the edge, and that makes the edge very much like a cloud: you push data in, you query data out, and everything inside is hidden and fully automated. You could think about it as extending the cloud to the edge, or creating a virtual cloud over the edge. Before I show you the demo, here is a summary slide, where on the left side you see all the reasons why the cloud is attractive, why companies are using the cloud or a centralized database. You manage the data from a single point, and you can do that in the cloud; at the edge you have those distributed nodes that you need to deal with individually. With this approach, from the user's and the application's point of view, all the data can be managed and made available from a single point as if it were centralized, although it remains at the edge. You have a unified view of the data even though the data stays at the edge. Using AnyLog, high availability is supported; it exists in the cloud but is missing at the edge, and we provide it, though I don't have time to talk about that here. It is very secure. And obviously the main reason to push processes to the edge is real time, right?
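The query flow described above, look up the distribution policies, fan the query out peer-to-peer, process locally, aggregate, can be sketched end to end. Everything here is an in-memory stand-in with hypothetical names; it illustrates the routing idea, not AnyLog's actual query engine.

```python
# Sketch of the query flow described above: a node receives a query,
# consults the shared metadata (distribution policies) to find which
# nodes host the table, fans the query out peer-to-peer, and aggregates
# the partial results. In-memory stand-in; all names are hypothetical.

# Shared metadata: distribution policies, table -> hosting node.
metadata = [
    {"distribution": {"table": "charging_stations", "host": "sf-node-1"}},
    {"distribution": {"table": "charging_stations", "host": "sf-node-2"}},
    {"distribution": {"table": "turbines", "host": "plant-node"}},
]

# The data itself stays on the edge nodes.
edge_data = {
    "sf-node-1": [{"station": "A", "wait_min": 3}, {"station": "B", "wait_min": 12}],
    "sf-node-2": [{"station": "C", "wait_min": 4}],
    "plant-node": [{"turbine": "T1", "rpm": 900}],
}

def hosts_of(table):
    """Metadata lookup: which nodes registered a policy for this table?"""
    return [p["distribution"]["host"] for p in metadata
            if p["distribution"]["table"] == table]

def query(table, predicate):
    """Fan out to the hosting nodes, process locally, aggregate."""
    rows, nodes = [], hosts_of(table)
    for node in nodes:                       # peer-to-peer, like MapReduce
        rows += [r for r in edge_data[node] if predicate(r)]
    return {"rows": rows, "summary": {"nodes": len(nodes)}}

# "Charging stations with a wait shorter than five minutes."
result = query("charging_stations", lambda r: r["wait_min"] < 5)
print(result["rows"])      # stations A and C; station B (12 min) filtered out
print(result["summary"])   # {'nodes': 2} -- data came back from 2 nodes
```

Note that the predicate runs on each hosting node's local rows, so only the query goes out and only matching rows come back, which is the efficiency argument made above.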
You need the real time; that's not the only reason, but it's the main advantage of the edge over the cloud. Well, since you're at the edge with AnyLog, you also get the benefit of interacting with your data in real time. So what I'm trying to say is: here are all the reasons why you move all your data from the edge to the cloud, but we can bring all this functionality to the edge and make the edge appear to you as if it were a centralized database or a cloud.

From here, let me show you a quick demo. This is a live demo, not a mock-up or anything like that. If you see those circles, those are AnyLog nodes deployed all over the world; if you see a node somewhere, it is really deployed there, and each one of those nodes is connected to PLCs and sensors, depending on the use case. Even now, data is being streamed to all those nodes all over the world, and as we said, that's the only thing users need to do: deploy AnyLog, connect the devices, and from here the data is available. To demonstrate that, we took Grafana and connected it to the network. The network is this collection of nodes; practically speaking, "network" is a slightly vague term, so practically, we take Grafana and connect it to one of the nodes in the network. It's completely decentralized, so it doesn't matter which node; pick a node, I think I picked the second one from the left, connect Grafana to it, and from there we can interact with and see all the data as if it lived on the node we're connected to. I can move from one dataset to another, and this is all real time, so this will move as we speak. Grafana issues the queries as if it were working against a centralized database, but there is no centralized database; the data is distributed all over the world on those nodes. I could have thousands of datasets and there is no problem, no scaling issues; you just interact with the data as if it lived on the node you're connected to.

To make it more tangible, let's move to this GUI. This is just a REST GUI, like Postman or curl if you're familiar with those, so I can send requests to nodes in the network. I selected one node, so this is the node that I'm connected to, one of those nodes you saw. Let me send a query to this node. Here is an example of the query, and I'm sending it to the network; I'm not telling the query where it needs to go. I hit send, and I get the data back as if I were working against a centralized database. You can see the data here, but there is no centralized database; what I've got is just those nodes at the edge. If I scroll all the way to the bottom, there is a summary section, and you can see the data came back from six nodes. What happened here is that we issued the query to one node, the node you saw at the top. This node looked at the query, looked at the shared metadata, and said: I don't have the answer, but I know who does. It shipped the query to six other nodes at the edge that have the data, got the results, aggregated them, and returned the result to my screen as if there were one physical database, although there is no physical database; it's all distributed over those edge nodes. So the data came back from six nodes, and you can see the time is listed as zero; the reason is that we don't print fractions of a second, so it took less than a second. This is live data which is being ingested right now, and again, everything I'm showing here is done without a single line of code. Just deploy AnyLog on edge nodes, configure AnyLog, push your data, and that's it; now you can run queries. You don't need domain knowledge, you don't need to send engineers, you just need to host AnyLog at your edge. By the way, those buttons here, you can configure them for frequently used queries and commands.

In the same way, we can monitor the resources, and I'll show you a little bit of that, but first, here is a different query, and this time, instead of me running it, I'll let you run it. If I go here to code, you can run it via curl or via this QR code. If you take your cell phone right now and point it at this QR code, you can try it. Here, I did, and you can see the data came back to my cell phone; if you try it and scroll all the way to the bottom, you'll also see that the data came back from multiple nodes at the edge. The idea is, if you try it, it should work, unless you have some security block. And the thing to say here is that we didn't send engineers to your cell phone; it's very easy to integrate into your applications. Everything I do here, you can just issue as a REST request to the node. Maybe one more thing I'll show you on the data side is videos and images. Those are managed in the same way: keep the videos distributed. Think about a store with many, many branches; that could be thousands of cameras. Obviously you don't want to centralize all of that, probably even if you want to you can't, and even if you did bring the data to some centralized location, what are you going to do there? So here is an example. This is a query on the AI side that looks at video, and you can see the query asks to rank, ordered by the number of people visiting the store; you can see zero, one, going all the way up to five people at some points. But then, if I want to see the videos, why centralize them?
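The pull-only-what-you-need pattern for video can be sketched like this: each store's node keeps its clips plus AI-derived metadata (a people count per clip), and the application fetches only the clips matching its filter instead of centralizing every camera feed. This is illustrative only; the catalog layout and field names are hypothetical.

```python
# Sketch of "bring only the videos you care about": the heavy video
# bytes stay at each store's edge node; only clips matching the filter
# cross the network. Catalog structure and names are hypothetical.

# Per-node clip catalogs with AI-derived metadata (people count).
store_nodes = {
    "store-12": [{"clip": "cam1-0900.mp4", "people": 0},
                 {"clip": "cam1-1230.mp4", "people": 5}],
    "store-47": [{"clip": "cam3-1100.mp4", "people": 2},
                 {"clip": "cam3-1800.mp4", "people": 4}],
}

def rank_by_people():
    """The demo query: rank clips by the number of people seen."""
    clips = [dict(c, node=n) for n, cs in store_nodes.items() for c in cs]
    return sorted(clips, key=lambda c: c["people"], reverse=True)

def fetch_clips(min_people):
    """Transfer only the matching clips, not every camera feed."""
    return [c["clip"] for c in rank_by_people() if c["people"] >= min_people]

print(fetch_clips(4))   # only the busy clips cross the network
```

The ranking runs over lightweight metadata; the expensive transfer happens only for the two clips that pass the filter, which is the point being made about thousands of cameras.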
Let's just bring the videos that I care about let's say that I want to see those videos and then I could just bring just the things that I care about to my application to where I want to look at it versus let's take the videos from all the stores all the time and centralize it which is very inefficient we all understand the limitations of that and then the next thing that I'll show you is that and again I could go with a lot of details but at the high level the same setup not only manage the data the data is obviously the harder thing the bigger problem but the same setup manage all the resources so again from a single point in the same way that I interacted with data I could monitor all my resources so you could see I could bring to a point very much like at the cloud I could see it and through a single point have alerts so here I think this place is under 50% it slags it as red or if data is not being ingested or whatever your use case is you could monitor all your edge as if it is running through a single point and as if all your edge is a single machine so from a single point you could monitor everything which makes the edge the distributed edge manageable and accessible very much like the cloud so since I'm talking to a group with a big interest in the telco I think there is one more important thing that I want to discuss which is not technical but on the business side which I think is very very important in the context of telco companies so if you're a telco company you're obviously looking for way to leverage the 5G or private network offering and so on and here is the way I mean everything that I showed you here was leveraging the network as the thing that serviced the data which is I think very powerful in the context of telco that wants to to make their services available so if we think about the model today about the role of the telco we see the edge as a place that generate data and the telco plays mostly on the purple side where they service pipes to 
transfer the data from the edge to the cloud and if you think about what's there so at the edge you would find proprietary software like we said there are no standards there each company builds its own solution and figures out what to do and then the networking is the thing that bridges would create the pipe would take the data from the edge and bring it to the cloud so how about moving to a little bit different model obviously those pipes would always need to be there but there is a place for another model where at the at the edge part it's a platform place so think about any log is something that is being offered as part of the telco infrastructure and what's interesting here is that if it's provided as part of the networking offering no changes are required to the telco infrastructure I mean you saw the demo it just works so it's kind of an easy add home for a telco for a networking proposition but now the telco is also playing not only as a connector or a pipe between the the edge and the cloud but it also is someone that brings a platform and when we talk about the edge data or data that remains at the edge it's the network that connects the connects the data with applications so the telco the networking become the thing that service the data which is obviously a very interesting role and the model here is instead of taking data and copying data from the edge to the cloud and make the cloud providers the ones that service the data this model would always stay but what about a model next to it where some data remains in place and service by the network by the telco companies through the applications and you saw the benefits you saw the value of the users so what I'm trying to say there is a really interesting opportunity for the telcos to consider so be happy to answer any any question and you know if somebody is interested in trying any log or learning more feel free to reach out to me my email is on the screen thanks thanks mosh if participants have questions 
they can unmute themselves and they can ask and David could you check anything on youtube any questions on youtube yeah sure I'll take a look mosh I have one question about interoperability let's suppose the last slide you showed there we are merging a gen network so let's suppose if we have a different because in telco we have a different telco operators play an important role so how different telco operators network can be merged together to become a virtual data center kind of thing and then can be shared on the network as well yeah so the requirements from the network itself is pretty low there is no restrictions I mean it's all based on peer to peer so there are solutions where messages could cross across network offerings if that's what you're saying and a company could also deploy an overlay network there are a lot of ways to approach it so we don't deal with the network itself and that's the opportunity I think for the telco to say we could bridge between every device that you have and a node that will dose the data and then we create this layer where the data is not just in transition to the cloud but we could allow you to query your data directly from the edge but if network if data will remain in that network only then why we need cloud in that case means if we anyhow we are not sharing data with the cloud it's really up to you so there are reasons to move it to the cloud but it's you know you could say I have a process on Oracle database that needs the data locally I mean it's what we're saying look we could make the edge an environment which is similar to the cloud where you issue a query and get the result and we could also argue that in many many cases it would be more efficient for sure it's more cost effective and we could go on and talk about all the advantages but there are processes that are not related to what I showed and reasons why a company would want to bring data to the cloud and you know and then in those cases this setup makes it even 
easy because you know we've got all the integration to the cloud like I mentioned and a connector to Kafka you could transfer data to the cloud you don't need to but you may have business or technical reasons because of something that is not related to what I discussed here that says you know this is great but some data needs to to go to the cloud what we're saying even if in that case even if you lower the amount of data that goes to the cloud you could have a big advantage so let's say you used to move 100% let's say 80% of your data you moved to the cloud you could move now 50% or 30% or in some cases like you say you could just rely at the edge huge cost saving obviously the data at the edge is serviced in real time so that's a big advantage we talked about so there is cost there is real time there is ownership issues you could keep everything inside your firewalls there is a lot of value in what we're doing but nevertheless in many cases you would still need to bring data to the cloud and it's not contradicting it just makes it simple to support this requirement if you have such a requirement Thanks Moshe Moshe what is the latency between the query from the edge to one of the distributed centers between the latency between the edge and between when yeah between the when it queries for data what is the latency so you saw in the query example that it was a fraction of a second right so it was less than a second and if we put an overlay network so this was done without any optimization of the networking side what we could do is put an overlay network on top of it and so it would be even more efficient and more predictable and so on but I tried but for sure this is way more efficient then let's take all the so look what companies are doing today because of all the problems that we've discussed in the beginning they say we don't have much choice even if it's great to leave data at the edge there is no solution for it we've got to bring it to the cloud so you first 
have to take all your data from the edge and put it in the cloud, and we understand that it doesn't make sense. I mean, if Google were to say, in order to facilitate search, let's take all the websites and move them to some centralized location, we would understand that doesn't make sense. But companies are doing it with IoT data, with edge data, because there is no other way. What I showed you now is that there is another way: you can keep all of your data, or some of your data, at the edge and make it available and accessible without effort. It's all plug and play. You go to the deployment, deploy AnyLog, and then you can do what I've done, without engineers, without any effort. This is something that could change the way companies interact with data, and it brings an opportunity that wasn't available before. And on the timing side, for sure this is more efficient than taking all the data, building those huge databases in the cloud, and then interacting with the data there. What we're saying is that you can keep data at the edge, run queries, and get the results very efficiently. What's transferred over the network is just the query one way and the results back, which is far more efficient than moving all the data and building those big databases in the cloud.

So these databases that are distributed, are they just replicas? Repositories?
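The query flow described above, where only the query travels to the edge and only the results travel back, can be sketched as a scatter/gather pattern. This is a minimal illustrative sketch, not AnyLog's actual API: the table-to-node map, the node names, the in-memory LOCAL_DATA, and run_local_query are all assumptions made up for the example.

```python
# Illustrative scatter/gather: the query fans out only to nodes that
# hold relevant data, each node computes locally, and only the small
# partial results cross the network. All names here are hypothetical.

from concurrent.futures import ThreadPoolExecutor

# Metadata layer: which edge nodes host data for each table.
TABLE_LOCATIONS = {
    "battery": ["node-plant-a", "node-plant-b"],
    "inverter": ["node-plant-a", "node-charging-1"],
}

# Stand-in for each node's local data store.
LOCAL_DATA = {
    "node-plant-a": {"battery": [41.8, 42.1], "inverter": [230.0]},
    "node-plant-b": {"battery": [39.5]},
    "node-charging-1": {"inverter": [231.2]},
}

def run_local_query(node, table):
    """Each node aggregates over its own data; only (sum, count) returns."""
    values = LOCAL_DATA[node].get(table, [])
    return sum(values), len(values)

def distributed_avg(table):
    """Route the query only to nodes hosting the table, then merge results."""
    nodes = TABLE_LOCATIONS[table]
    with ThreadPoolExecutor() as pool:  # nodes work in parallel
        partials = list(pool.map(lambda n: run_local_query(n, table), nodes))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None

print(distributed_avg("battery"))  # merged average over two nodes' data
```

Note that the merge step only works because each node ships back a mergeable partial (sum and count) rather than raw rows, which is what keeps the network traffic to "just the query one way and the results back."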
Okay, so I'm not sure what you meant by replicas. What I showed you is that in this model you take your data and push it into nodes at the edge, and each of those nodes has different data. If it's an industrial floor, all the machinery on that floor would be connected to a node and would push to that node. In this model you could then have another industrial plant in a different location, so you have a different node there, and you could have a node in a charging station. Each of those nodes has different data. Nevertheless, when you issue a query, it goes to the nodes that have the relevant data; the process knows how to identify where at the edge the relevant data resides and bring it from there. Very much like Google, right? When you do a Google search, it sends you to the websites that are relevant for your search. Here it's a SQL query, and it sends you to the nodes that have the relevant data.

There is maybe another question that is kind of implied, about high availability. Nodes have high availability in the sense that you can deploy so that there are replicas. Obviously, the data of a particular node is not replicated on all the nodes, but you can say, instead of one node that holds the data, I want two or three nodes that are replicas of each other. So you've got islands of replicas, such that if you lose a node, the query is routed to one of the surviving nodes and you get high availability.

Okay, so these are individual databases at the edge? Correct. That you can query from anywhere, to that node? You query from anywhere, but they appear to you as a single unified database. So what would applications see? Actually, I can show you what happens when an application connects. When an application connects to the network, what it sees is something like this: it sees a list of
databases, for example the "lights on the intro" database, and for that database it sees a list of tables. Then, for a particular table, it can say, give me the list of columns, and then it can issue a query in exactly the same way it would against a centralized database. It doesn't care where the data is. But as, let's say, an administrator, or someone who cares or wants to know, I can look at it from a slightly different view. Again, you see this database, "lights on the intro", and all its tables, battery, inverter, and so on, and if you go all the way to the right, you can see the physical nodes that host the data. But the application doesn't care about that; it just issues a query and gets the result, and the query is routed to the relevant nodes that host the data.

And the database at a node depends on the node's capabilities, right? That is correct. That is correct, but what's also interesting is the contrast with the conventional approach, where this is the hard decision: when you go back to the project, you need to figure out how much data there is and how much compute power you need to put there. This model is much easier in the sense that it scales horizontally. You can start with one node, and because the data is distributed, if that node is not sufficient, you add another node, and another node; you just scale by adding nodes. It doesn't matter that the data is spread across multiple nodes, because the query process, as we've seen, looks at it as a unified collection of data. Actually, the distribution helps: when you add another node, you add more CPU, and all the nodes that participate in the query process work in parallel. In a way, you are increasing the degree of parallelism and putting less data on each node, so it just helps you in terms of performance and scaling.

Thank you for the explanation, Moshe. Any more questions? If
not, then thanks. Thanks, everyone, for joining today. Thank you, Moshe, for accepting our request and for such an amazing presentation. Thank you very much; we really appreciate your being here. Thanks a lot. Yeah, thanks, everyone. Thank you. Bye.
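As a closing illustration of the "islands of replicas" idea discussed in the Q&A: each island replicates one slice of the data across a few of its own nodes, and a query is rerouted to a surviving replica when a node is lost. This is a hypothetical sketch; the island names, node names, and liveness map are invented for the example and do not reflect AnyLog's actual deployment model.

```python
# Hypothetical sketch of "islands of replicas" for high availability.
# Each island holds one slice of the data, replicated across its nodes;
# data is NOT replicated across islands. All names are illustrative.

REPLICA_ISLANDS = {
    "plant-floor": ["node-a1", "node-a2", "node-a3"],  # replicas of each other
    "charging-station": ["node-c1", "node-c2"],
}

# Liveness as seen by the routing layer (node-a1 is down in this example).
ALIVE = {"node-a1": False, "node-a2": True, "node-a3": True,
         "node-c1": True, "node-c2": True}

def route_query(island):
    """Send the query to the first surviving replica in the island."""
    for node in REPLICA_ISLANDS[island]:
        if ALIVE.get(node, False):
            return node
    raise RuntimeError(f"all replicas of island {island!r} are down")

print(route_query("plant-floor"))  # node-a1 is down, so node-a2 answers
```

The point of the structure is the one made in the talk: a particular node's data is not replicated everywhere, only within its island, so losing one node degrades nothing as long as a sibling replica survives.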