Hello everyone. My name is Caio Oliveira, and today I'm going to talk about industrial IoT. I hope some of you were here for the last session, because I'm going to revisit some things that were discussed there and add our own ideas.

As I said, I'm Caio Oliveira. I'm a software engineer. I've been writing software since around '97, and professionally since 2007. I now work for Intel, at the Open Source Technology Center. For the last two years I've been working on middleware for IoT. I started with smart home, but now I'm looking at industrial.

In this talk I'll go through what industrial IoT is from our team's perspective. Then I'll move into the challenges we have to get there: we see this as an ideal, so what stands between us and it? I'll go through some of the solutions we think we need, and I'll finish with a vision of where we think we should get. I'll also spend some slides on a technology we're building to support that vision, which is distributed PubSub for IoT.

In many talks, and I'm sure in the keynote too, there are a lot of predictions: there are billions of devices, and these devices are generating a huge amount of data. I like this example because, for me, it's too much data, and it makes me wonder how we use that data. I highlighted that 1 million gigabytes of data are produced in a connected factory. So I thought about the ways we use that data, and there are different ways, right?

I like to make an analogy with a camera. Some data you grab and use immediately. For example, and this is a true story, I once took a picture of a sign that was far away just because I could zoom in on it. I collected some data and used it right away. Sometimes you just store data to use later.
For example, I can take a picture and publish it on a social network for my family back in Brazil to see me. In that case the data is consumed raw: I just wanted them to see the picture. Sometimes you need processing on the data. Sticking to the camera, say I have a camera in a hallway or at a door, and I just want an alarm triggered when someone goes by. I'm not consuming all that data. I'm consuming a processed version of it: this is the time someone came in, this is the time someone left.

This example is interesting because it shows processed data informing someone's decision. We see a lot of that: we collect information to make a decision. And I think the next step, if we want to consume all the data we're going to generate in two or three years, is to think about how we can also process data and make machines use it.

To keep with the camera example, I can have a machine that has a camera and a robot arm. It performs some activity on my product, and I use the camera to figure out whether the result is correct or not. Another example is at the end of a manufacturing line, where you can use images to decide whether to discard a product because it's damaged or keep it. So it's not even a person consuming this data anymore. It's another machine. Making sure we can feed data to machines is what will make it possible to get value from all the data we've been talking about here.

As I said, I worked with smart home IoT. So when I came to industrial IoT and what people mean by that, I identified some differences. First, it's a different scale. We are not talking about tens of devices, like you might have in your home. We're talking about tens of thousands of devices. There's also a concern about the, how do I put that?
I don't like talking about legacy technology, because the technology is being used today. But in fact, in the industrial domain, there is a set of existing machines that you can't replace in the short term. The machines there have a longer life cycle. Compare that to the smart home: there's this new thing I want to use, and I just need to change my TV. Sometimes people do that. In factories, it's more costly and more complicated to remodel your factory.

Another difference is that there's a lot more control, meaning systems controlling machines. You don't have a lot of that in the smart home. And with control come other requirements, like safety and security. Security is another point I want to stress. In the smart home, we were sometimes okay with solutions that meant "I'll take your data, process it in a cloud, and give you the result back." In factories, sometimes this isn't even possible because of regulations. Sometimes you have data that must stay inside your factory.

I'm going to take a look now at the challenges we have. I'll look deeper into some issues and also into how to take advantage of all the data.

The first challenge, and this was discussed earlier, is connectivity. We talked about legacy technology. I think it was in a talk yesterday that they mentioned some turbines where the level of connectivity was a green button and a red button. You have all those turbines and they're not connected. Sometimes that's the case: you have devices that are manual. Other times you have devices that do have technology and some degree of automation, but they live on an island. They're not connected to the rest of your factory. So you get some benefits, but they're limited. You still have these islands of connectivity.
I don't have the drawing here, but in the other presentation it was nice to see that they had a drawing with the air gaps. That's the idea. You see a lot of those air gaps in industrial.

Sometimes you have connectivity, but you have it from vendors whose solutions do everything in the cloud. That's what they call fragile connectivity: this device can access information from that device, but you need to go to the cloud and come back, which sometimes is not good enough.

There's also a low perceived value, although this is changing. For some factories it is not clear what value they can get from connecting these things. This is a bit of a chicken-and-egg problem, because we need people to adopt things to get the real value of this connectivity.

A particular case where connectivity is a challenge for us is the separation between IT and OT networks. To summarize a very complicated topic: there's the IT network, which is a best-effort technology, so there's only a certain degree of quality of service that you can offer. And there are the operations technology networks, which is the term we use for networks that connect machines that control each other and have deterministic network requirements: I need to be able to give an answer in five milliseconds, or whatever. These are different networks, and sometimes you have islands because the technology we have for IT networks is different and can't give enough guarantees to our machines.

At least in this case, there is a future solution. I'm not sure if you are familiar with it, but there is a set of specifications, TSN (Time-Sensitive Networking), that's being pushed by the industry so that, at least at the base level, you can share the same infrastructure, which opens the possibility of connecting the IT and OT networks.
Another challenge on the way to the industrial IoT vision that we have is interoperability at the semantic level. I'll use an example here because I think it helps.

When I talk about connectivity, I'm talking about "I can call you": I can dial your number and talk to you. That's one level. Another level is: you call me, but I don't speak your language. Literally, you are speaking, I don't know, Portuguese, and I only speak English. This is the level of the protocols. At this level, for IT networks, you have a pretty good idea of what protocols you can have and need to support. But if you think about the OT networks and the new things that are appearing for industrial IoT, there's not much agreement at that level.

And we have another level: we agree on the protocol, we both speak English, but we are talking about different things. This is the data model. For example, I understand programming, but you're talking to me about finance. I don't have the concepts necessary to understand what you're telling me. Because I'm a human, I might try to learn from you, but that's not what machines are going to do right now. They need to be talking about the same data model. They must agree at the semantic level to be able to use the information.

Just to give an idea of the different protocols we are talking about, there's a common way to classify them. There are the northbound protocols, from the edge to the cloud. You see MQTT, AMQP and others here. There are the east-west protocols that you use to talk between the nodes on your network. You also see AMQP, DDS and MQTT here, and Intel is also proposing a solution here. And there are a lot of southbound protocols, which are the protocols we use to talk directly to the machines. You see Modbus, EtherCAT and OPC. And I'm talking only about protocols here.
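The classification above can be written down as a small lookup table. This is only a sketch of the grouping named in the talk, not a complete catalogue: the `Direction` enum and the `directions_for` helper are illustrative names, and placing DPS under east-west is an assumption based on it being described as Intel's node-to-node proposal.

```python
from enum import Enum


class Direction(Enum):
    NORTHBOUND = "edge to cloud"
    EAST_WEST = "node to node"
    SOUTHBOUND = "node to machine"


# Grouping as named in the talk; a protocol can appear in more than one group.
PROTOCOLS = {
    Direction.NORTHBOUND: {"MQTT", "AMQP"},
    Direction.EAST_WEST: {"AMQP", "DDS", "MQTT", "DPS"},  # DPS placement is an assumption
    Direction.SOUTHBOUND: {"Modbus", "EtherCAT", "OPC"},
}


def directions_for(protocol):
    """Return every direction in which the given protocol was listed, sorted by name."""
    matches = [d for d, names in PROTOCOLS.items() if protocol in names]
    return sorted(matches, key=lambda d: d.name)
```

For instance, `directions_for("MQTT")` reports both the east-west and northbound groups, which reflects the point that the same protocol can show up at more than one boundary.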
There's also the other layer we talked about, which is the data model. I might be able to translate between these protocols, but can I understand the payload that I got?

Another challenge that we see is that we must go beyond cloud architectures. The example I like to use here is simple, but it explains the limitations of a cloud architecture. Imagine that I have a smart light bulb that talks to the cloud. I want to use my cell phone to turn it on, but if I don't have internet, I can't. From our point of view, depending on the cloud is not enough for industrial IoT. It gives you too many limitations.

Even if I have an internet connection, both on the light bulb and on my phone, I still have latency. If the control goes through the cloud and comes back, maybe it takes two or three seconds for the light to come on. That latency is incompatible with many industrial use cases.

Finally, there's the security aspect. Say connectivity is fast, or I don't care about waiting two or three seconds, and I want to turn on the light. Great. But do I really need to send that information, "I'm turning my light on", to the server and back?

The light example is very simple, but you can see that we could do better. If you turn these limitations into features, you get: I can support lower latency, I don't need to depend on the internet, and I can provide data only to the nodes that need it. This is mostly what OpenFog architectures are about. If you've heard about edge architecture and fog architecture, in a very simplified way, this is what they are talking about.

There are other challenges, of course. We chose to focus on these because they help us build the picture and the vision of industrial IoT.
Now I'll start with a diagram, and we'll work through it together, building on pieces from the previous slides, I hope, so it will be easy to understand.

This is a model of the architecture that we have today. At the top you have the cloud, with computation and storage happening there. This cloud is connected to the two sites that I have. Can you see this? No. Okay. These two sites could be two different factories. They're not connected directly at this point, so they operate separately.

The gray box is a logical representation: all the data we are talking about, everything that is going to be transmitted, is in this box. You see that the red devices produce something using data model A and the yellow devices produce something using data model B. So if I have an interface like this one on the left, it must understand all these different protocols and different data models. Also, the connection to the cloud might need another layer of translation to make things useful for the compute.

This drawing also shows the islands, like the green parts. They're not connected. They have technology involved, so you have an interface node using a data model, but they're not connected to the rest of your network. On the right side there's another example, the purple one, which shows a node that knows how to talk to the IT network but talks to devices that need a deterministic network. It's an illustration of the IT/OT separation. The nodes at the bottom require some kind of TSN technology, so they can't be connected to the other part.

The first thing that we think is part of the solution is to provide a common language for the machines. The difference between the previous diagram and this one is that now we're pushing the machines to talk the same language and use the same data model.
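To make the "common language" idea concrete, here is a minimal sketch of two device families that report the same quantity in different data models, plus translators that map both onto one shared model. Everything in it is hypothetical: the `Reading` type, the field names, and the two payload layouts are invented for illustration and are not taken from OCF, OPC, or any real device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """Hypothetical common model that every consumer understands."""
    device_id: str
    quantity: str
    value: float
    unit: str


def from_model_a(msg: dict) -> Reading:
    # Hypothetical "model A": temperature in tenths of a degree Celsius.
    return Reading(msg["id"], "temperature", msg["temp_decic"] / 10.0, "C")


def from_model_b(msg: dict) -> Reading:
    # Hypothetical "model B": temperature in Fahrenheit under a nested key.
    fahrenheit = msg["sensor"]["tempF"]
    celsius = round((fahrenheit - 32) * 5 / 9, 2)
    return Reading(msg["sensor"]["name"], "temperature", celsius, "C")


# One translator per source model; consumers only ever see Reading.
TRANSLATORS = {"A": from_model_a, "B": from_model_b}


def normalize(model: str, msg: dict) -> Reading:
    """Translate a device-specific payload into the common model."""
    return TRANSLATORS[model](msg)
```

With this shape, an interface node only needs to understand `Reading`. Supporting a new device family means adding one translator rather than teaching every consumer a new model.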
Sometimes this means that you need to put a bridge between your network and the rest of the devices, like the example in red. The red devices are still using, let's say, the OPC protocol and the OPC data models, but we have a bridge that publishes the data in a way that the other nodes can consume. This kind of translation was going to happen anyway, because you don't want to push the OPC UA data through the cloud for computing. What we are doing here is making the machines use the same language more and more, so that other nodes can take advantage of it. In this diagram, you see that the interface I had, the HMI, no longer needs to speak many languages. It can stick to the blue data model and protocol.

Oh, sorry, I forgot something. There's an open source project that aims to provide exactly that, which is IoTivity. It implements the specifications from OCF, the Open Connectivity Foundation. Part of these specifications is a way to do bridging between technologies that are relevant in these domains.

Elaborating on our diagram, we have the goal of more connected networks. Let me go back. Remember the green island? We're putting a bridge there and connecting it. If we just look locally, you could say: you put the bridge there and you put the HMI there, so you're just going through another path and getting the same things, right? Well, it depends, because I had another green island at the other plant. If we are more connected, you can also look at the data from that other green island. That's what we mean by more connected networks. Let's ensure that you have this free flow of information between the machines so you can do more interesting things.

Then there's the next step that I mentioned before, converging IT and OT networks. This is where we start to think about software-defined infrastructure.
Compare this with the previous slide: to interact with the purple boxes, I needed a specific device that talked to the deterministic network. If I can manage to integrate those networks, I can have a node that controls this, and we could even virtualize that node. You don't need a specific box that talks to that specific device. You can have a more general box that has I/O capabilities and that you can load with software to talk to that device. If you need to change it, you can.

The last step, as you can imagine, is to bring more compute into your factory network, into your local network. The same kind of computation that we want to do in the cloud, we should be able to do locally. Today, for the cloud, using OpenStack or Intel's Ciao scheduler, you have ways to say: I have these applications and they need to run with certain affinities. For example, I have an application that must run near my database. Or I have two applications where one is generating data for the other, so I want to run them in the same data center, near each other.

These same affinities could be extended to say things like: I have this application and it must run near this specific I/O that I have. Or: I have this application and it has latency requirements for producing data for another application, so I want to run it in my plant. And, well, that's no different. At this point, this is what we envision.

So the ability to take compute near the edge is one important thing. Another important thing is to ensure that these nodes can really understand each other, and that the data produced by the machines can be consumed by the machines without a lot of intervention from us. That's where we want to get.

How are we getting there? On that part I have good news and somewhat challenging news.
The good news is that there's a lot of open source software and open standards, either already working or being developed, to address parts of the challenges we talked about. I'll mention some.

There's the effort from TSN and the Avnu Alliance to converge IT and OT networks. The Avnu Alliance, which is steering this work, provides an open source stack that used to be called OpenAVB and is now called OpenAvnu, to enable this kind of technology.

There's also OCF, which I mentioned. The goal is basically to figure out which protocols and which data model those machines should be speaking. And there's IoTivity, which is the open source implementation of OCF.

On another level, the level of how the machines call each other, we have different protocols: DDS, and DPS from Intel, which I'm going to talk about briefly.

For managing this network and deciding where each application should run, you have OpenStack, you have Ciao, you have Kubernetes. There's a set of solutions there. They're not necessarily ready for this kind of fog architecture, but they certainly could be extended for it.

So we have some idea of where the pieces will come from. There's also integration work to be done. I see that work being done in part by OpenFog, which is trying to define a reference architecture and how the pieces talk to each other. And in OCF there's a task group dedicated specifically to industrial concerns, trying to fill the gaps that we identified here.

The challenging news is that this takes a lot of effort. Integrating everything is complicated. We could have a very good solution for a particular thing, but making sure that everything works together is a challenge. So that's a challenge, which is good.

I'll talk now about one of the pieces that Intel just released and is working on. It's still in alpha stage, but it's part of our proposed solution.
In the diagrams, I showed this data plane, which is the gray box where everything was connected. This is a logical representation to highlight that we want a free flow of information between those nodes. Regardless of how a node is connected, if it has the authorization to get to the data, it should get it. So it's a way to abstract: some technology will deal with the data plane. That's what DPS is about.

It's a distributed PubSub for IoT, and we just released it as open source. The main developer is Greg. He's here at the conference, if you're interested. I'm working on that software too, if you want to talk about it.

It's a fully distributed PubSub solution, and by that I mean it's brokerless. There is no broker node. This is an important aspect of DPS because it means we don't have a single point of failure. If there is a way to get the information through, we'll get it through. That's a nice feature to have in an industrial environment.

It supports lightweight publishers, so you can have a very small sensor producing data for it. It supports data series, acknowledgments, and retained delivery. Retained delivery basically means this: if I have a node that is a very small sensor, it will just produce some data, dump it, and go to sleep until it can produce more. This data goes to the network, but it can be retained by other nodes in the middle. So a consumer doesn't depend on being in exact synchronization with the producer.

It also supports security, both for node-to-node communication and for end-to-end communication. We said that all the data is there. So if anyone taps into that connection, do they see everything? Not necessarily, because the fact that you are on the data plane doesn't mean that you have the authorization to read all the data.

The project is on GitHub, and I invite you to take a look if you're interested. And that's what I had for today.
If you have questions, please talk to me during the conference or shoot me an email. And if you haven't chosen which talks to go to, later today I recommend kitchen talk, which is going to be very interesting.

Question?

[Audience question]

Yes and no. Yes, it's a PubSub, and you have many of the features that you have in MQTT, like topics you can subscribe to using wildcards. The "no" part is that it doesn't rely on a central broker. So in a sense, every node can act as a router for the packets. Yeah.

Any more questions?

[Audience question]

Well, not today, but if you're interested, we can talk about DPS, and I hope that at the next conference we'll be demoing much more of it.

Okay, so thank you everyone for your time, and have a good lunch.
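As a rough illustration of the two DPS properties described in the talk, brokerless routing and retained delivery, here is a toy in-memory sketch. It is not the DPS API and does not reflect how DPS is implemented: the `Node` class, its methods, and the flooding strategy are all assumptions made for this example, and topics are matched exactly, with no wildcards.

```python
class Node:
    """Toy mesh node: there is no broker, and every node can forward and retain."""

    def __init__(self, name):
        self.name = name          # assumed unique within the mesh
        self.peers = []           # directly linked nodes
        self.retained = {}        # topic -> last payload seen by this node
        self.subscriptions = []   # list of (topic, callback)

    def link(self, other):
        """Create a bidirectional link between two nodes."""
        self.peers.append(other)
        other.peers.append(self)

    def publish(self, topic, payload, _seen=None):
        """Flood a publication through the mesh, retaining it at every hop."""
        seen = _seen if _seen is not None else set()
        if self.name in seen:
            return  # this node already handled the publication
        seen.add(self.name)
        self.retained[topic] = payload
        for sub_topic, callback in self.subscriptions:
            if sub_topic == topic:
                callback(topic, payload)
        for peer in self.peers:
            peer.publish(topic, payload, seen)

    def subscribe(self, topic, callback):
        """Register a callback; a late subscriber gets the retained value at once."""
        self.subscriptions.append((topic, callback))
        if topic in self.retained:
            callback(topic, self.retained[topic])
```

A sensor node can publish once and go to sleep. A consumer that subscribes later, on any node the publication reached, still receives the last value, which is the "no exact synchronization" property mentioned above. A real system would also need loop suppression across concurrent publications, expiry of retained data, and the authorization checks the talk describes.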