[unintelligible] At the time, they were doing 150-in-1, 200-in-1 type kits, which I got through like nobody's business, and then I started creating my own different projects from them. That led me on to the internet of the day as far as electronics was concerned, which was the monthly magazines that you used to be able to get from WH Smiths, et cetera. This took me through to the more complex magazines, the more professional magazines of the time. [unintelligible] ... machine language, et cetera, et cetera.

So that set me on a different course, and then this evolved. We see these eight-bit processors, things like the 6809 and the 6502, bringing out lots of different types of minicomputers and micros, as they were known here, which meant that I had to understand this, the von Neumann architecture, from this chap. And I had to learn all this stuff. Of course, in those days, you couldn't do it on a screen most of the time. You had to do it in a notebook, on a bit of paper, et cetera. Actually, when I first went to college, that's what we typed our programs in with, using hex, a very laborious process. I wouldn't recommend it as a technique, but it does make you disciplined to write your code right, because one wrong step and you have to start again. And then later on, we even had things like printouts like this. You'd have a trolley go round, and you could connect it to your micro system and actually print your program out, and then you get things like the Apple coming along.

Meanwhile, something else I'd been interested in was artificial intelligence in its various different forms. One of the most important developments, from about 1957, I believe, was the introduction, or the invention, of the perceptron by Rosenblatt. He was trying to simulate what he thought a neuron would do inside the brain, and this is what he came up with. It's a very simple construction. On the left-hand side there, you see all of the various inputs or signals that are coming in. Those are then amplified, if you like, or multiplied in a mathematical sense, by different weights, in this case expressed by w0 through to wm. That is then all added up using a summer. That output, when it reaches a threshold, will either trigger this to fire or not trigger it to fire. And that's kind of what goes on with neurons; it's a bit more complicated than that, but this enabled you to teach it: from certain combinations of inputs, you would get an output or you wouldn't get an output.
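To make that concrete, here's a minimal sketch of Rosenblatt's unit in Python. The weights and threshold are made-up values, chosen by hand so the unit behaves like an AND gate; nothing here is taken from the slide itself:

```python
import numpy as np

def perceptron(x, w, threshold=0.5):
    """Rosenblatt's unit: multiply inputs by weights, sum, then fire or not."""
    s = np.dot(w, x)          # the "summer": weighted sum of the incoming signals
    return 1 if s >= threshold else 0

# made-up weights that make this unit behave like an AND gate
w = np.array([0.4, 0.4])
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, perceptron(np.array(x), w))   # fires only when both inputs are on
```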
The perceptron itself is actually slightly limited. There are a bunch of logical things that it cannot do, like exclusive-or, for example. But when you start combining them together, you can do exclusive-ors as well. So, later on, what this became... or, one other point here: we've got an activation function here, which is just a simple threshold, but normally that's some sort of squashing function. It's the non-linear part of the transfer. Hold on, excuse me. Disaster struck. OK.

So, what Marvin Minsky was very interested in doing was combining these, or building on things like the perceptron, and building networks of these small units to create what are in effect artificial neural networks. When I was at Kingston Poly, I discovered this book, which is called Parallel Distributed Processing, which is a bit deceptive nowadays, but at that time it was really focused on different ways of calculating things with relation to artificial intelligence. So, it covers things like Boltzmann machines, perceptrons, and other devices of the time.

Then, of course, this happened, which led to the PC, and so, for me, that meant writing some of this kind of stuff instead of working on AI. The IBM 55SX happened, which was one that I had, and I did actually return briefly to try to programme neural networks using C++ at the time. However, the 55SX had an optional co-processor, a floating-point co-processor. The one I had did not have this, nor did I have the money to pay for it at the time. So, it was very, very slow, I got very frustrated, and I thought, oh, I'll come back to this another time.

Meanwhile, life carried on, this faded into the past, I got on with some more programming, Windows happened, and then I ended up working in graphics cards, among other things, back with my electronics roots. Those things turned into these BMFs. What happened on the microprocessor front is that Intel just kept stepping it up; they ran out of clock speed at some point and went parallel into cores. But what we find now, when it comes to working with different workloads, when you start moving over to doing something like artificial intelligence, particularly with neural networks, is that even these sorts of architectures struggle to chew through the numbers that are needed for each of those neurons.

So, this is an example of a convolutional neural network. In this case, the purpose is to take image pixels, or values, in, and to actually convolve them into segments, and then use neural networks to actually provide weightings. What happens is the neural network will learn certain features, like a line, a straight line, an edge, or a pattern, or a shape, and these will go together to form certain outputs. So, this particular example is called a classification example. What will happen is you present it with a number of images, and maybe you are looking for pictures of a face, so it recognises faces as opposed to cars or something else. So, you would actually train it to recognise those things. And this is where these come back in, because these graphics card implementations turned into thousands and thousands of very small, very fast floating-point units.
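To give a feel for what convolving an image and learning an edge feature means, here's a minimal sketch in plain NumPy. The tiny "image" and the kernel are made-up values; in a real convolutional network the kernel weights would be learned during training, not hand-picked like this:

```python
import numpy as np

# a tiny made-up "image": dark on the left, bright on the right
image = np.array([[0, 0, 0, 9, 9, 9]] * 6, dtype=float)

# a hand-made kernel of the sort a convolutional layer might learn:
# it responds strongly to vertical edges
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

# slide the kernel over the image, taking the weighted sum at each position
h, w = image.shape
out = np.zeros((h - 2, w - 2))
for i in range(h - 2):
    for j in range(w - 2):
        out[i, j] = np.sum(image[i:i+3, j:j+3] * kernel)

print(out)   # large values mark where the edge sits
```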
So, that is the inside of Pascal, which is the latest engine on the NVIDIA graphics acceleration cards, and also their numerical accelerator cards, because they're not selling these into just the graphics marketplace anymore; they're selling them into all sorts of different markets to do with high-order processing, neural networks, artificial intelligence, et cetera.

So, coming back to this, I realise that in order to make robotics a bit more useful, we could do with some embedded intelligence. We could do with some of this actually inside our robots. However, what we can't fit inside our robots and our embedded applications are ginormous data centres, the likes of which Google, Facebook, Microsoft, et cetera, are using to do their artificial intelligence and their machine learning. So, we've got to find other ways. This is kind of what we want. Wouldn't it be nice to have a brain on a chip that you can just plug in when you need it? That's ideally where we're going to end up, but we're a ways away. If you remember, last year I pointed some of this stuff out: the old technology is the traditional von Neumann architectures, and then there's the new technology that is kind of crossing this boundary as the processing requirements go up and it becomes very highly concurrent and parallel.

So, what's the difference between machine learning and programming in terms of solving a particular problem? In traditional programming, what you have to do is write all of the rules up front. That means you need to know what's going to happen up front, in all cases. That becomes increasingly difficult, and it's very, very difficult if you want to use that on something that is in the real world. So, for example, a robot that needs to manoeuvre around a house, avoid obstacles, avoid people, recognise people, et cetera. It's very difficult to come up with a set of rules to do this. For many years in artificial intelligence, there were proponents of the rule-based approach versus something else, like a neural network approach, a machine learning approach. Coming up with those rules is very difficult.

If we look at machine learning, on the other hand, it's a very different kind of arrangement, and I'm going to try and define here what actually happens. The machine learning part, which is coloured orange here, starts with having some training data. Again, it's the old adage of garbage in, garbage out. You need good data to train whatever your neural network or machine learning algorithm is going to operate on. It's very important that you process your information. In the case of facial recognition, that will be a good selection of images containing faces: different angles, different lightings, different skin colours, different tones, black and white and colour, et cetera, et cetera. Normally, when you do that, you also split your data up. You put some aside that you don't train your machine on, because you're going to need that to test how well you've trained it afterwards. You can't do that with data you've already used, because it just recognises it. And if you train it too hard, it actually gets stuck on the training data and doesn't do its job in a general way. There are all sorts of little gotchas when it comes to machine training, to do with the data and how you train it.

The second part is what's called inference. Here's where you take that trained neural network model, with all those weights now set, and what you do is apply it to new images.
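That hold-out step looks something like this in practice. A minimal sketch using scikit-learn, with its built-in digits dataset standing in for the face images:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

digits = load_digits()                      # small labelled images, 10 classes

# put 20% of the data aside before any training happens
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0)

# train only on the training split...
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
clf.fit(X_train, y_train)

# ...and score on data the network has never seen; otherwise you're
# just testing whether it memorised the inputs
print("held-out accuracy:", clf.score(X_test, y_test))
```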
You would show a new facial image to this network, in this case, and it would then come out with: this is or this isn't a face, for example, or this is a girl, or this is a boy, or this person has brown hair or black hair, et cetera, et cetera. Whatever you've trained it to look for. They're two very distinct steps, and they can be separated.

Just going over how we train each node in the neural network: we look at the output, this is in the supervised learning case, and we compare it to what we expect the output to be. Then we take the difference between what it thinks it is and what we think it is, and we feed that back into the network, so we adjust the weights to try and move it in the direction we want it to go. It's a bit more complicated than that in a real neural network, because you haven't just got one perceptron, you've got layers, so you have to feed the error back through the layers using something called back propagation, and you can use stochastic gradient descent in order to home in on those errors, reduce them, and refine your weights. I'll show a tiny sketch of that feedback loop in a moment.

There are different ways of arranging your network for different tasks. I'll just give you a few. The simple perceptron is there on the left, and then a feed-forward network. If you look at the feed-forward network in this example, the yellow are the inputs; the green are hidden, they're not connected to the outside world, they contain the encoding for the neural network, the learning; and the red, or the amber in this case, is the output required. Some of the more exotic ones are the recurrent neural networks on the next line, and the long short-term memory. These have been very useful in dealing with things like voice recognition. What happens here is you haven't just got a planar set of inputs, hidden layer, hidden layer, outputs. What happens is you take some of the inner layers and you feed them back in time to another layer, and then back in time again. You're breaking the network up over time steps, because if, for example, you want to do voice recognition, you can't just take a single moment in time and determine what that word is going to be, or who that person might be. You need to take a sample over time. The pattern is a pattern of changes over time. The learning has to occur over time divisions, in steps, and that's what recurrent neural networks and long short-term memory (LSTM) type neural networks offer.

So what tools would you use to actually start manipulating neural networks? Machine learning tools. The really good news on this front is that pretty much everything in this marketplace is open source. I know of very few pieces of software that are not open source in the machine learning world. Probably the most commonly known one is TensorFlow, which is what Google use, and that's what they expose to the public. If you want to build neural networks that run on their cloud infrastructure, their image recognition, et cetera, then you'd use their TensorFlow. One thing that does stand out in this is Python. It seems to be the most common language within machine learning. I'm not entirely sure why; I think it has to do with the fact that it grew out of data processing. Python is a very good language for data processing, with things like NumPy, Pandas, SciPy, scikit-learn, et cetera. So all the libraries were there early on, and that brought in Python developers. You'll also see some common non-vendor open-source Python libraries; Theano and Keras are both very well-known ones.
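Here's that supervised feedback loop in its simplest possible form: a single perceptron trained with the classic perceptron learning rule. The AND-gate data is made up purely for illustration; a real multi-layer network does the same job with back propagation and stochastic gradient descent:

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])   # toy inputs
y = np.array([0, 0, 0, 1])                       # what we expect (an AND gate)

w, b, lr = np.zeros(2), 0.0, 0.1                 # weights, bias, learning rate

for epoch in range(20):
    for xi, target in zip(X, y):
        pred = 1 if np.dot(w, xi) + b >= 0 else 0
        error = target - pred     # difference between expected and actual output
        w += lr * error * xi      # feed the error back: nudge weights toward the target
        b += lr * error

print(w, b)   # learned weights that reproduce the AND gate
```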
They're quite easy to use and very abstracted, which makes this whole process much easier, so you don't have to work out all those individual calculations of weights and sums; that's all done for you. And they have standard network topologies that you can kind of pluck out as objects and assemble together. Other common ones are Torch and Caffe. These offer C++ as well as Python-type wrappers around them. But those three sections at the top tend to be quite heavy, heavy tools. They tend to be used for those data centres and very large implementations of neural networks. They're not necessarily the most efficient. Many of them do have accelerators underneath, so if you have an NVIDIA card, for example, with the CUDA libraries installed, it will use those to do the calculations, and it will speed machine learning up considerably, because the learning side can be very slow.

Then, if we move down, I've got another one here. It's called ELL, the Embedded Learning Library, which is embedded C++. This is actually from Microsoft. They've written it to try to be very efficient, to run on small embedded platforms like mobile phones and embedded systems; there's an implementation of it for the Raspberry Pi, for example, and a number of others. It's quite interesting to see that coming from Microsoft. It seems to be one of the better ones out there. Again, it's a bit more lean and mean than some of the others. The library dependencies on those top three can be quite large, and they can be difficult to put on something like a Raspberry Pi. But if you go down to something like ELL, you're going to get it onto the Raspberry Pi quite easily. In fact, the example on the GitHub page for it actually uses a Raspberry Pi. If you want to go down very low level and just build very simple neural networks, there's a library called FANN. I've seen this being used on microcontrollers. If you've got a really teeny-tiny neural network, you can use that. It's basically just a C library with the hard bits, the weight and array manipulations, done for you. There are many others. There are so many now, it's difficult to keep up with them as they come out. This is a very hot area. How are we doing?

One of the problems with most of these neural networks is that after you've created or taught your network to do something, you quite often end up with something very large, particularly when it comes to things like image processing. You process a lot of data and store a lot of weights. This can really slow things down. In many cases, you're using teraflop-type video cards in order to get you through this process. This is a problem when it comes down to embedded, or, as the cloud providers would like to call it, edge running of these neural networks, because clearly you don't have teraflops available to you on the average embedded system. One of the things you can do is take a trained network and compress that network down. You can train it in a data centre, effectively, using simulations or just pure data, images, et cetera, and then you can actually compress that down into a smaller format. When you create these, you'd probably use full 32- or 64-bit floating-point implementations to represent those weights. By the time you finish this compression process, you'd probably want to go down to about eight- or nine-bit representations for each of those weights, either using integers or fixed point; I've even seen something called a dynamic fixed point implementation.
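As a crude illustration of that compression idea, here's a sketch of squashing float32 weights down to 8-bit integers with a single scale factor. The weights are random stand-ins, and real toolchains (and the dynamic fixed point schemes) are considerably cleverer than this:

```python
import numpy as np

w = np.random.randn(1000).astype(np.float32)   # stand-in for trained float32 weights

# simple linear quantisation to signed 8-bit integers
scale = np.abs(w).max() / 127.0                # map the largest weight to 127
w_q = np.round(w / scale).astype(np.int8)      # now 4x smaller than float32

# at inference time you work with the int8 values and fold the scale back in
w_restored = w_q.astype(np.float32) * scale
print("worst-case weight error:", np.abs(w - w_restored).max())
```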
There is a lot of trickery involved in doing that, and a lot of research is going on right now to make it happen. What are you going to be running this on? If you've got a very powerful system, in terms of having lots of power available to you, you might be using something like the NVIDIA Jetson, the TK1 or TX2 embedded GPUs. These are the same as the graphics cards, but not quite as powerful, and much more power-efficient. Even so, we're still talking tens and tens of watts here to run these. They will offer literally teraflops' worth of processing on an embedded platform, but they will generate a lot of heat. You need big heat sinks on these things, et cetera. That's fine if it's going in something like a car that has a very high-capacity battery; not so good if you want to put it in a miniature robot.

Something else we've seen is that, commonly, an ARM will be used as the main system, running something like Linux, and then you might add in something to speed things up. ARM does have the Mali chipset, and in their next generation they are also going to be accelerating neural networks, from an inference point of view. Right now, if you just want to play with this, what you can do is add in something like the Movidius compute stick that Andy recently reviewed. Movidius was a Dublin start-up that was bought by Intel. Inside that little Movidius stick is a small processor. It's tiny, it's about this big. It's capable of doing 4 trillion operations per second in less than a few watts, which is quite incredible. It won't do floating point at that rate, but it will do basic integer-type operations. It has a mixture of sixteen SHAVE vector processors plus a normal processor, or dual processor, and it has hardware convolution and things for image processing. Most of the applications so far have been on things like drones, for image recognition, et cetera.

The other point down at the bottom here is about using FPGAs to do acceleration. You can provide a soft core that is dedicated to speeding up the artificial intelligence parts, i.e. the neural network parts. I'll give some examples. I know I'm showing my Lattice iCE here; you'd normally use something slightly larger, as you can't actually fit much in a Lattice iCE, but they've even crammed some very simple inference, for things like voice recognition, into these small Lattice iCE chips as well. This market is opening up all of the time.

So what are the approaches if you were using an FPGA? How would you go about implementing this on an FPGA? One way is just to make a whole bunch of very small processors and have them work in parallel. This is very popular. What happens is they take a neural network structure and convert it into, effectively, a bunch of C code running on lots of little distributed processors in parallel. That's not very efficient, but when you've got lots and lots of FPGA resources, like Intel have on their new Altera chips, then it actually works quite well, and they have very high bandwidth to memory. In the data centre, that's a really good way to go; for embedded, it's probably not, because of the inefficiencies. As I mentioned before, compressing or reducing things down to 8- or 9-bit fixed point or 8-bit integers enables you to run that in the DSP units inside an FPGA. You want a DSP-rich FPGA fabric. A DSP unit is really just a multiplier and adder unit that sits in an FPGA.
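In software terms, what one of those DSP units does for a single neuron is just a multiply-accumulate over the quantised values, something like this sketch (all the numbers here are made up); the FPGA simply has lots of these running side by side:

```python
import numpy as np

# made-up 8-bit quantised inputs and weights for one neuron
x_q = np.array([12, -7, 33, 5], dtype=np.int8)
w_q = np.array([25, 14, -9, 40], dtype=np.int8)

# each DSP slice multiplies and adds; accumulate in a wider register
# so the sum of many int8 products doesn't overflow
acc = np.int32(0)
for x, w in zip(x_q, w_q):
    acc += np.int32(x) * np.int32(w)

print(acc)   # this accumulator then feeds the activation function
```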
You just want to string all of these together, have memory in between them, and run the whole thing in real time, feeding from one section to the next. You can go another layer further. Instead of using 8- or 9-bit values, you can actually binarise a neural network. That means the outputs of the neurons are literally 0 or 1, and so are the inputs. But this is a very complicated procedure. You are losing lots of accuracy, and it's still very much a hot research area; there isn't any easy way of doing it at the moment.

One other approach is the neuromorphic approach. The neuromorphic approach tries to emulate what the brain does. Rather than just crunching numbers through memory, what this does is actually send spikes, or pulses of different widths, between different units. You can actually build a logical FPGA unit that is sending addresses and values over a kind of multiplexed bus. This tends to be quite efficient in terms of power usage and things like that. We'll see more of this making its way into the marketplace, I think, as a solution.

Where are we going to use this stuff? Probably the biggest killer app from an embedded point of view right now is in the automobile industry. There is a ginormous amount of money being spent right now by pretty much all of the automakers, plus all of the new players such as Tesla, et cetera. What are they using machine learning for? Well, at the simple end you've got PID tuning for the steering. There's a simple algorithm called twiddle; I don't know if anyone's seen that. I may cover it tomorrow in the workshop if people are interested; it's a really good way in. It just does the tuning of the PID parameters (there's a minimal sketch of it at the end of this section). Then you've got things like recognising signs, so again you've got image processing here, object recognition using convolutional neural networks. You've also got things like analysing the camera information to tell you: are you in the lane? Are you going in the right direction? When it comes to steering, it needs to analyse where those lines are, where the road is, where the convergence point in the road is, et cetera. But more than that, you have to do things like look at the obstacles, the other vehicles in the road, if you need to do a lane change, et cetera. So there's a lot of pattern recognition going on there, lots of image recognition. As far as machine learning goes, this market is like a bonanza, and so the automobile industry right now is eating up lots of engineers to actually solve these kinds of problems. And that's not even thinking about the higher-level things, the do-I-run-over-the-mother-with-a-pushchair or do-I-run-over-the-five-kids-if-I've-got-a-choice type dilemmas that they have to solve.

What else would you want to use machine learning for? Path planning, using neural Turing machines. So if you've got a small robot and you want to just move it around a place, you can get it to learn certain routes, around a house, for example, or around a factory, that kind of thing.
Facial recognition is a very common one; most of the library work has been done, and you can actually get pre-trained models from Google, and re-implement them in compressed versions, and the same from Microsoft. Voice recognition I've mentioned; we're going to see voice recognition in many, many different consumer products, cars, pretty much everywhere: Siri, Alexa, et cetera. And robot object adaption: teaching robot arms to be able to pick up and recognise different things. Again, trying to write rules for that stuff is really difficult; neural networks are really useful for this. And then strategising as well, using recurrent neural networks and long short-term memories.
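And, as promised, here's a minimal sketch of the twiddle algorithm mentioned above for tuning the three PID gains. The run_simulation function is a hypothetical stand-in you'd supply yourself; it should run the controller with the given gains and return an error score to minimise:

```python
def twiddle(run_simulation, tolerance=0.001):
    """Coordinate-ascent tuning of [Kp, Ki, Kd]; run_simulation is a
    hypothetical user-supplied function returning an error to minimise."""
    p  = [0.0, 0.0, 0.0]      # the PID gains being tuned
    dp = [1.0, 1.0, 1.0]      # how big a step to try for each gain
    best_err = run_simulation(p)

    while sum(dp) > tolerance:
        for i in range(len(p)):
            p[i] += dp[i]                 # try stepping this gain up
            err = run_simulation(p)
            if err < best_err:
                best_err = err
                dp[i] *= 1.1              # it helped: be bolder next time
            else:
                p[i] -= 2 * dp[i]         # otherwise try stepping down
                err = run_simulation(p)
                if err < best_err:
                    best_err = err
                    dp[i] *= 1.1
                else:
                    p[i] += dp[i]         # neither helped: restore, shrink the step
                    dp[i] *= 0.9
    return p
```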