Let's talk not about XMR or Monero, but rather about Mirai, computer vision, and the security of connected webcams and the embedded devices that go with them. You can call me Michael; my full name is Michael Schloh von Bennewitz. It's a really long name because it's a German name, and I spend a lot of time in Germany. In Munich, for example, I have a laboratory, a research facility, and I'm building a small fabrication facility there at the moment. I have another office in San Francisco, and I travel quite a lot because I give workshops.

The intention today was to have a hands-on workshop, and I see there are some laptops out, so you're welcome to follow along with the things I type later on. But the full version requires four hours, and we have so little time that this is going to be mostly demonstration. We'll look at these topics with as many demonstrations as possible and, unfortunately, very little hands-on time.

I forgot to bring my laser pointer up here. There are a few different things we're going to focus on. We're going to intentionally de-emphasize the cloud portion, because so little of it is open source; there's not much transparency in typical cloud machine-learning offerings like IBM Watson. So we'll de-emphasize that a bit, try to stay inside the edges of the network, and do everything locally with projects like OpenCV, CMU Sphinx, and Python.

Let's get started, then. This is how a network typically looks from the IoT or low-power-device perspective, where you're possibly not always connected and connectivity is intermittent. You have gateways sending your data out, maybe to the cloud at the very end, or to other connected computers in data centers; these are your devices here, your device profiles, and in some cases you might have the same thing on the other side. This is just a very simplistic version of what you might find. Today we'll mostly look at microphones and webcams and try to understand how processing works at the edge.

These are the typical computers we use in a workshop, what we would have if we all had nice tables with a power socket at each one, plus stable wireless and Ethernet networks. They are Intel NUCs; NUC stands for Next Unit of Computing. These are the machines we travel around with, but if you were expecting a four-hour workshop today, we won't have that; we don't have the facilities for it. You can always contact me after the workshop. I'll put some cards up front, in case you have a private question, a question you didn't get to, a request to schedule a workshop at your university, or an off-topic question, like about the hardware wallets I passed out a while ago. I don't have new cards, but I have old ones from Black Hat, so I'll put those over there.
Today we're going to stick to the edges, everything inside the network, and do local processing, which we can at least demonstrate on a notebook. For computer vision, at least in the open source world, OpenCV rules the day, and that's what we'll be using in the time we have.

Here's an overview of its history. The project started in 1999, not 2009; I knew there was a nine in there. It began as an Intel laboratory project, grew slowly at first, and then gained traction and was adopted by the open source community, which has by now largely taken over maintenance and support. Today OpenCV is quite a large, relatively well-known project with a large and active community. There are a couple of URLs on the slide that are very useful for anyone who wants details; there's a lot more to it than this, and we'll get to some of it later with a couple of demonstrations.

These are the use cases: robots following a line, anything that involves vision, controlling door access by whether a door is open or closed, estimating how long a suitcase has been without its owner in a train station, or how fast traffic is crossing a line. These are all very good use cases for OpenCV, along with motion tracking and so on. There are a couple of more exotic use cases too: if you build a robot that looks like a human, enter a club, and try to get a view of all the android criminals in the club, you do something like this. I don't know if you caught the joke; this is the Terminator, who had computer vision in his eyes. Here are a couple of more typical examples: edge detection, and finding the nodes of objects in order to play chess or whatever you have.

And this is a typical Python program for OpenCV. We'll do a C++ one; let's do that now. We have a few more slides, but I'm going to blow past them because we just don't have time. Computer vision is obviously compute-intensive. Now, this code is not good style, smushing lines together and using if conditions with no brackets and so on, but I did that so it was easier to pack the text onto one slide in large type, and it's really not that difficult to understand. This is a basic Hello World project for OpenCV: we're just displaying a bitmap file.

Let me copy that over and see what happens when I try to demonstrate it, OpenCV this time, not the Sphinx library. I'll make the terminal a bit bigger. It's not on the projector, so let me fix the display mirroring; that looks better, doesn't it? It means I can't see it but you can, so just tell me if I type something crazy. So that's what we had before. Is there a Makefile here? Not yet, but I can make one. If I make a build directory and go into it... let me just go back. What we had before was a CMakeLists.txt, and if I take a look in there, there are some definitions, because this is a CMake project. The OpenCV developers really like CMake, so they make it easy for us to use. That's what it looks like; that's defining our project.
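For reference, the Python equivalent from the slide is only a few lines. This is a minimal sketch rather than the slide's exact code; taking the image filename from the command line and the window title are my assumptions:

```python
import sys
import cv2

# Load the image named on the command line and show it until a key is pressed
image = cv2.imread(sys.argv[1])
if image is None:
    sys.exit("could not read " + sys.argv[1])
cv2.imshow("Display Image", image)
cv2.waitKey(0)
cv2.destroyAllWindows()
```

And the CMakeLists.txt defining the C++ project is similarly small, along these lines; the project and target names here are placeholders, not necessarily what was on screen:

```cmake
cmake_minimum_required(VERSION 2.8)
project(DisplayImage)

# Locate the installed OpenCV headers and libraries
find_package(OpenCV REQUIRED)

# Build the one-file demo and link it against OpenCV
add_executable(display_image display_image.cpp)
target_link_libraries(display_image ${OpenCV_LIBS})
```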
And there's nothing inside the current directory yet. So if I run cmake and give it the location of the CMakeLists.txt that defines the project, CMake starts working and, let me just clear that, gives us these files here. Remember, when I listed the directory before, it was empty; now we have all of these things, including our Makefile, which is a bit too long to show, sorry. What that means is I can type make, and it will build the project. How fast did it go? About ten seconds. Why is that cut off? It's not very nice to have it cut off; something like that is maybe better. After building, from that C++ code we now have a binary, called display or something like that, and I have prepared a crazy-smiley image. So this is basically just showing the image. It's a really trivial program, but it's the Hello World equivalent: it tells us that the CMake project generation is working, that OpenCV is correctly installed, and all of that. Anyway, how do I get out of there, Escape? And what do I do now to get back to the slides? Okay, there's the mouse. This is really confusing with these slides.

So we're about halfway through. We talked about OpenCV use cases, we did a demonstration in C++, and we saw the equivalent Python code as well. Usually we do this in a laboratory environment in a workshop, but there's no time for that today.

All right, back to the conventional network diagram we have from the IoT perspective, and to what else we could do with a camera attached; maybe we'll get to an OpenCV demonstration with a camera so that we can detect a smile, for example (a sketch of what that camera loop might look like follows at the end of this section). And there is a microphone on most cameras, so let me show how that might work. Voice-to-text recognition is one of the use cases we can address, not with OpenCV, but with generic IoT edge processing for microphones. This slide shows the history of this type of recognition over the years: since the 1950s we've seen steady progress, and today we're getting pretty close to full-vocabulary systems, even on the edge.

One of the open-source-friendly projects is CMU Sphinx; CMU stands for Carnegie Mellon University, where the project originated. I've been using it quite a bit and it's very stable. It works from Python, and you can see the kind of statistical acoustic-model processing scheme it uses. Let's go past the deep-learning slides and do another demonstration. These are all things that could be happening on IoT devices, in cars, drones, and so on.

Actually, I've already got this set up, and I can't see this screen either. Because I'm using Python modules for this, I can show what I installed beforehand; we wouldn't have time to do it now over the unstable network. The most important ones are SpeechRecognition, which is here; PyAudio, just above it, which handles the microphone; and PocketSphinx, the actual library that does the text analysis.
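As promised a moment ago, here is roughly what that smile-detection camera loop could look like. This is a minimal sketch, not code from the talk; it assumes the stock Haar cascade files that ship with OpenCV's Python package:

```python
import cv2

# Haar cascade classifiers bundled with OpenCV (paths exposed via cv2.data)
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
smile_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_smile.xml")

cap = cv2.VideoCapture(0)  # first attached webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
        cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)
        # Within each detected face, look for a smile; a high minNeighbors
        # value cuts down on false positives
        roi = gray[y:y + h, x:x + w]
        if len(smile_cascade.detectMultiScale(roi, 1.7, 20)) > 0:
            cv2.putText(frame, "smile", (x, y - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
    cv2.imshow("camera", frame)
    if cv2.waitKey(1) & 0xFF == 27:  # Esc quits
        break
cap.release()
cv2.destroyAllWindows()
```

A loop like this is exactly where small boards struggle: every frame costs two cascade passes, which is why real-time smile analysis fails on a BeagleBone Black, as comes up again in the questions later.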
So I'll just go into Python and run this one. I hope it's going to use the correct microphone; let's see what happens. "Hello, FOSDEM. We are in Brussels." Now it's recognizing; that takes about two and a half seconds. There's the result: it got "hello" and "we're in Brussels", but FOSDEM is obviously not in its vocabulary. Because it's not in the dictionary. Right, exactly.

One interesting thing, and I can show the source code in a minute, is that it's actually doing two different types of recognition. First it does edge-based recognition, entirely local, using the Sphinx library; that took two and a half seconds to produce text of this quality, and it got a few things wrong, FOSDEM among them. For the second pass it used cloud processing, so 50,000 computers in Iceland and everywhere else, in a rather less open-source manner. It did a slightly better job; in most cases it gets close to 100 percent, though words like FOSDEM it still won't catch, and it's a bit faster as well, even though I'm on the same unstable network as everyone else.

You're probably wondering what this looks like. Oops, I just wiggled the cable; yes, that's the one that came unplugged. So, intermission time: how are we doing with the hardware wallets? Has anyone stored funds on them? I accept tips. Somebody answered before me, so... okay.

Right, let's take a look at the source code. It's a bit longer, maybe a hundred lines, which is why I didn't put it on a slide; but it's only long because I'm doing the two different kinds of analysis. The first part uses the Sphinx library: we set up the microphone and listen for a while to gather a baseline of the room's ambient noise, so that the recognizer knows what's speech and what's not. You could use this in traffic or in a silent car, all these different situations. After the baseline is done, we print "say something" here and start listening, for five seconds I believe. After that comes the Sphinx logic, a try/except block; you can see that at the end of it is the recognition call. This is quite a useful library, and I do recommend it, especially in an open source context where a lot of people enjoy Python. Just a little below that, you might be interested to see how the same thing works with GCP, the Google Cloud Platform: it starts here, we take a timestamp there so we know how long it took, and then, one line up, there it is, the call that recognizes the audio using the Google platform and tells us what it thought we spoke. A condensed sketch of this dual-path logic appears at the end of this section.

All right, the only thing left to do... anyone see the mouse? The tab is not doing too well. There we go. This is what we cover when we have four hours: we look at Google IoT Core and the types of services it offers, and the same thing with AWS. That's the console you typically see with GCP, the Google Cloud Platform. Where is AWS? I think I skipped it accidentally.
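For reference, here is that condensed sketch of the dual-path logic, using the SpeechRecognition package's documented API. It is not the talk's hundred-line script: recognize_google, Google's keyless web recognizer, stands in here for the Google Cloud Platform recognizer used on stage (recognize_google_cloud, which requires GCP credentials), and the prompts and durations are simplified:

```python
import time
import speech_recognition as sr

recognizer = sr.Recognizer()

with sr.Microphone() as source:
    # Listen briefly first to get a baseline of the room's ambient noise,
    # so the recognizer can tell speech from background
    recognizer.adjust_for_ambient_noise(source)
    print("Say something...")
    audio = recognizer.listen(source, phrase_time_limit=5)

# Edge path: fully local recognition with CMU PocketSphinx
start = time.time()
try:
    print("Sphinx heard:", recognizer.recognize_sphinx(audio))
except sr.UnknownValueError:
    print("Sphinx could not understand the audio")
print("local pass took %.1f s" % (time.time() - start))

# Cloud path: the same audio sent out to Google for comparison
start = time.time()
try:
    print("Google heard:", recognizer.recognize_google(audio))
except (sr.UnknownValueError, sr.RequestError) as err:
    print("cloud pass failed:", err)
print("cloud pass took %.1f s" % (time.time() - start))
```

Timing both recognizers on the same captured audio is what made the edge-versus-cloud comparison in the demo concrete.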
All right, this is the structure, the architecture, of a typical AWS flow, where you have Lambda functions and device shadows, all of these things which are useful for IoT devices. This is what we could be using, for example, in combination with the OpenCV library.

The one thing we haven't covered, and I think we won't have time for, is the history of Mirai and what happened there. If you're not up to date: the authors, there were two of them, or maybe three with a third collaborator, have been caught, and they pleaded guilty in December, which is an interesting part of the story. Just to give an overview, the Mirai botnet was used to attack Dyn and the whole Dyn-controlled network; a lot of high-profile groups and companies were brought down for a long, long time. That was in 2016. We typically look at the source code, how it was controlled, what the scanning feature was, and which IoT devices it attacked, like set-top boxes and webcams, and at how we might protect against that in the future. We can also develop our own webcam using Tessel boards, BeagleBone Blacks, or MinnowBoards, and use Node.js JavaScript or Python; that's always a lot of fun, and it takes a few hours as well. But because we're running out of time, I'd just invite questions of any kind, even on Mirai. Yes?

"You did the voice processing on the edge of the network. Why? Normally in IoT these edge devices are very small, and they send the data to the cloud for cloud processing. Why did you do it on the device?"

Let me repeat the question: why did we do the processing on the edge, internally on the local network so to speak, instead of sending the data out to the cloud? Though in fact I did both, and that was the point: I wanted to show the difference in utility, if you do have a stable network connection, of using the cloud for some of this processing. It gets even more obvious with computer vision, because tiny devices like BeagleBone Blacks simply can't do any real-time smile analysis; they can do some face detection and some edge detection, but with very poor granularity. That's the reason I did it both internally on our local network and externally; that's it in a nutshell. The typical devices we use are NUCs with Core i5 processors, where everything is very fluid, even computer vision. Here I have an older notebook and it works okay. I tried doing this on a BeagleBone Black yesterday and failed completely; I couldn't even get the necessary Python libraries onto the board. Any other questions?

"I missed the connection between Mirai and computer vision. Is computer vision targeted at Mirai-controlled devices?"

So the question is whether OpenCV or computer vision is part of the Mirai scenario, and I would say no. The Mirai botnet was strictly TCP/IP; it was a network thing where they launched denial-of-service attacks via IP packets on other systems and brought them down that way. There was no real computer vision involved.
I brought Mirai into this presentation because I consider the combination a future risk: if you have this type of DoS, denial of service, combined with analysis of the surroundings, audio or vision, temperature, all of these sensor types, you could do a lot more damage. Consider the sensors in a fuel plant, a processing plant, or an energy plant of some kind. Bringing the computers down is one thing, but understanding the heat-exchange systems is another; if you can manipulate those, you can do far more damage. Any other questions? We can take one last one.

"To link the two previous questions: from a security point of view it's not good to send personal data like audio and video to the cloud, but the cloud has the processing power to process it. Could there be a middleman, where you send the data to a local central unit that does the processing the things themselves cannot, for example using Sphinx, but you still don't send the data over the internet to the cloud? Do you foresee such a middleman between the things and the internet?"

Well, first of all, I think you answered your own question, and it's a wonderful answer. We've seen how processing works inside the edge, what kind of power you need, and what resolution and quality of results you get, as opposed to sending the data out to the cloud, which offers much higher resolution and processing power. It's quite a trade-off, and part of the question is whether a middle solution makes sense, a hybrid of some sort, where part of the time you do things locally, like using the Sphinx library for audio processing, and then, when you have a stable connection or the need, because going out to the cloud is expensive, you do the trade-off analysis and send the data out selectively. That would be what I consider a middleman scenario: your middleman does the work for you and stores or forwards the data according to a rule set. It sounds good to me. It's based on trade-offs: if the data is too sensitive to send to the cloud, you keep it internal and just do local processing. That makes a lot of sense, and I do think something like it is the future; but for this type of sensor network and processing of optical and audio data, there are too many scenarios to speculate on. Sorry, that's a non-conclusive answer.

"I find the hybrid approach very good, and in every case you need to look at the security profile. Antivirus products actually use this kind of scheme to analyze data. It's based on confidentiality: depending on how confidential the data is, you either analyze it locally or upload it to the cloud and let the cloud give a better response."

Let's leave it at that. Thank you very much, Michael. You're welcome.