I want to take a state-of-the-art TensorFlow model and use it to solve a problem that it wasn't trained for, so I'm going to be using deep learning here as a component of my solution rather than the primary focus of what I'm trying to build. So in a way it's a more industrial, commercial kind of application. The goal for this kind of toy problem is to distinguish pictures of classic and modern sports cars. You'll see some classic and modern sports cars a bit later; it's not so easy to say what the difference is. Obviously it could be different types of images and it could be lots of different classes, so I'm just doing a very simple two-class thing, but with quite complicated images. What I want is a very small training time, so I don't want to be retraining some huge network, particularly since in this case I've only got 20 training examples. I'm not going to do any fantastic million-image training; I've got 20 images to choose from. I also want to be able to put this in production, meaning I want it so I can just run it as a component of something else. So basically, one of the things which has been powering the deep learning field forwards is an image classification task called ImageNet. This has been a competition where they have 15 million labelled images from 22,000 categories, and you can see some of them here. This is a picture of a hot dog in a bun, and here are some of the categories. These are hot dogs, lots of different pictures of hot dogs, lots of different pictures of cheeseburgers, lots of different pictures of plates. So the task for ImageNet is to classify, for any one of these images, which of a thousand different categories it's from. And it used to be that people could score adequately well and were making incremental improvements in how well they could do this.
But the deep learning people came along and kind of tore this to shreds. And in particular Google came up with GoogLeNet, which is what we're actually going to use here, back in 2014. Suddenly this stuff is being done, by further iterations of this kind of theme, better than humans can do it. The way you can measure whether something is better than humans is you take a human and see whether it beats them. The question there is: are there labelling errors? So there you need a committee of humans. The way that they label these things is by running Mechanical Turk tasks and asking people what category this cheeseburger is in. So the network we're going to use here is the 2014 state of the art. It's called GoogLeNet, also called Inception version 1. The nice thing about this is that there is an existing model already trained for this task, and it's available for download, all free. And there are lots of different models out there; there's a model zoo for TensorFlow. What I have on my machine is a small model, about 20 megabytes, so it's not a very big model. Inception v4 is more like a 200 megabyte model, which is a bit heavy. So I'm working here on my laptop, and you're going to see it work in real time. And the trick here is that instead of the softmax layer at the end, which, I'll show you a diagram and it should be clear to anyone who's following along, instead of using the logits to get probabilities, I'm going to strip that away and train a support vector machine to distinguish between these classes. So I'm actually not going to retrain the Inception network at all. I'm going to just use it as a component, strip off the top classification piece and replace it with an SVM. Now SVMs are pretty well understood. So here I'm just using Inception as a featurizer for images. So here's a network picture. Basically this is what the network is designed for: you put in an image at the bottom.
There's this black box which is the Inception network, which is a bunch of CNNs, convolutional neural networks, followed by a dense network, followed by this logits layer. And this logits layer is essentially the same as the 0 to 9 that Sam had for his digits, except this is 1 to 1,000 for the different ImageNet classes. To actually get the ImageNet output, it uses a softmax function and then chooses the highest of these to say: this is the class that the image is in. Basically what I'm going to do is ignore this neat piece of classification technology that they've got and just say, well, let's use these outputs as inputs to an SVM and treat them as features. Now if we pick out one of these, this class could be cheeseburger, this class could be parrot, this other class could be husky dog; there are all sorts of classes in here. But basically what I'll be doing is extracting the features of these photos, saying: how much is this photo like a parrot? How much is this like a husky dog? Now it turns out that modern cars and classic cars can be distinguished that way. So let me go to some code. Okay, this code is all up on GitHub, and here's my Jupyter notebook. Can everyone see this well enough? You can see it. So basically I'm pulling in TensorFlow, and I pull in this model. Here is what the Inception architecture looks like. Basically it feeds forward this way: here you put your image, it goes through lots and lots of convolutional layers, all the way up to the end with a softmax on the output. So having done that, I download the checkpoint. This is the checkpoint here, which is a tar file. Basically I have it locally stored, so it doesn't download it now, but it's all there; even the big models are up there from Google, and they've pretrained these. Training an Inception model takes about a week on 64 GPUs, so you don't really want to be training this thing on your own.
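That final softmax-and-pick-the-winner step, the piece we're about to discard, can be sketched in a few lines of plain numpy. The class names and logit values here are made up for illustration; they just stand in for the 1,000-way ImageNet output:

```python
import numpy as np

def softmax(logits):
    # Subtract the max before exponentiating for numerical stability
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

# Three hypothetical ImageNet-style classes with raw logit scores
labels = ["cheeseburger", "parrot", "husky dog"]
logits = np.array([2.0, 0.5, 1.0])

probs = softmax(logits)            # probabilities summing to 1
top = labels[int(np.argmax(probs))]  # the classifier's top pick
print(top)  # cheeseburger
```

The plan in the talk is to keep the `logits` vector itself as a 1,000-number feature vector and skip the `softmax`/`argmax` entirely.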
You also need the ImageNet training set, which is a 140 gigabyte file, no fun to download. Okay, so what I'm doing here: there's also an inception library which is part of TF-Slim. Basically this thing is designed so that it already knows the network; it can preload it. This has loaded it, and I can get some labels. Sorry, this is loading up the ImageNet labels, because I need to know which position corresponds to which class; obviously the digits version is easy. So here, let me just run that. Here we're going through basically the same steps as the MNIST example, in that we reset the default graph and create a placeholder, which is where my images are going to go in as input. But from this image input I'm then going to do some TensorFlow steps, because TensorFlow has various preprocessing and graphics-handling commands. Because a lot of this stuff works with images, there's all sorts of clipping and rotating stuff, so that you can preprocess these images. I'm also going to pull out a numpy image, just so I can see what it's actually looking at. And here, with this inception_v1 arg_scope, I'm going to pull in the entire Inception v1 model. My init function, rather than just picking some random weights, is going to assign them from a checkpoint. So when I run the init step in my session, it won't initialize everything randomly; it will initialize everything from disk. So this defines the model, and now let's just look at it. One of the issues with having this as a nice TensorFlow graph is that it just says input, inception_v1, output. So there's a big blob there; you can delve into it if you want, but let me just show you, I can go back a bit. This is the code behind the Inception v1 model, and it's actually smaller than Inception v2 and v3. Basically we have a kind of base inception piece, which is just this.
And then these are combined together, and this is a detailed model put together by many smart people in 2014. It's got much more complicated since then, but fortunately they have written that code and we don't have to. So here what I'm going to do is load an example image, just to show you. One of the things here is that TensorFlow, in order to be efficient, really wants to do the loading itself. In order to keep the information pumping through, it wants you to set up queues of images, and it will then handle the whole ingestion process itself. The problem with that is that it's kind of complicated to do in a Jupyter notebook, right here. So I'm going to do the very simplest thing, which is load a numpy image and stuff that numpy image in. But what TensorFlow would love me to do, as you see in this one, is create a filename queue, and it will then run the queue and do the batching and all of this stuff itself, because then it can lay the work out across a potentially distributed cluster and do everything just right. Here I just want to load some images and have a look. So here I do the simple read of the image. This image is a tensor which is 224 by 224 by RGB. This is a kind of sanity check of what numbers I've got in the corner. And then what I'm going to do is crop out the middle section of it; this one happens to be the right size already. Basically if you've got odd shapes you need to think about what you're going to do. Am I going to pad it? Am I going to crop it? Because, in order to make this efficient, TensorFlow is going to want to lay things out without all the variability of image size; it's going to want one set of parameters, and it's then going to blast that across your GPU array or whatever. So let's just run this thing. So now we've defined the network. Here I'm going to create a session.
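The centre-crop step can be sketched in plain numpy. The 224×224 target matches what Inception v1 expects at its input; the helper name and the dummy image are mine, standing in for a photo read from disk:

```python
import numpy as np

def central_crop(image, size=224):
    # Crop a (height, width, 3) image array to a centred size x size square.
    h, w, _ = image.shape
    top = (h - size) // 2
    left = (w - size) // 2
    return image[top:top + size, left:left + size, :]

# A dummy 300x400 RGB "photo" stands in for one read from disk
img = np.zeros((300, 400, 3), dtype=np.uint8)
cropped = central_crop(img)
print(cropped.shape)  # (224, 224, 3)
```

Padding instead of cropping is the other option mentioned above; either way, everything entering the network has to end up the same shape.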
I'm going to init the session, which loads the data, and then I'm going to pick up the numpy image and the probabilities from the top layer and just show it. Here is an image; well, this is the image I pulled off the disk. And you can see here from the probabilities that it thinks this is a... the first probability is tabby cat, which is good. It's also interesting that the next in line are tiger cat, Egyptian cat, lynx, so it's got a fair idea that this is a cat, and in particular it's getting it right. So this is the same diagram we had before. Basically what you've seen is this image going in, this black box, and coming out the probabilities here. So what we're now going to do is go from the image through the black box and just learn a bunch of features. So what I have on disk, excuse me, let me just show you this on disk. I have a cars directory here, and inside it I have surprisingly little data. Let's do it the other way. In this directory I just have a bunch of car images, in two directories: one called classic, and the other called modern. So basically I picked some photos off Flickr and put them into two separate directories, and I'm going to use those directory names as the classification for these images. Now in the upper directory here I've got a bunch of test images, which I don't know the labels for. So that's the game. Basically this picks up the list of classes: there's a classic directory and a modern directory. And here what I'm going to do is go through every file in these directories, crop it, find the logits level, which is all the classes, and then just add these to my features. So basically I'm going to do something like a scikit-learn model, and I'm going to fit an SVM.
So basically this is featurizing all these pictures. So here we go with the training data. Okay, my machine is thinking about that. So here are some classic cars; it went through the classic directory. Here are some modern cars; it went through the modern directory. It's thinking hard. And what I'm going to do now is build an SVM over those features. My machine is hanging for some reason... sorry, here we go. As they say, this doesn't happen normally. No, this thing is dying. I'm sorry. My apologies, let's just go through this. Sorry, I'm running through the whole thing; I restarted it because my stupid thumbnail viewer hung the machine. So this whole thing then re-ran. The actual training for this SVM takes hardly any time; it's a very quick SVM fit on essentially 20 images' worth of 1,000 features each, so there was no big training loop to do. And I can then just run this on the images in the test set, images it has never seen before. It thinks that this is a modern car. This one it thinks is a classic car. This one it's classified as modern. So this is actually doing quite a good job from just 10 examples of each. It actually thinks this Prius is modern; it's not a sports car, but anyway. So this is just basically showing that the SVM we trained can classify based on the features that Inception is producing, because Inception understands what images are about. So if I go back to here: code's in GitHub; conclusions: okay, this thing really works. We didn't have to train a deep neural network; we could plug this TensorFlow model into an existing pipeline. And this is actually something where the TensorFlow Summit has something to say about these pipelines, because not only are they talking about deep learning, they're talking about the whole cloud-based learning setup and setting up proper processes. So I guess it's time for questions, quickly.
And then we can do the quick TensorFlow Summit wrap-up. So there's no backpropagation at all? No, this involves no backpropagation at all, no adjusting of the network weights; the features stay fixed during this training. You can imagine that if the ImageNet task had focused more on products, on man-made things, it could be even better. The ImageNet training set has got an awful lot of dogs in it, not that many cats. On the other hand, it may be that it has quite a lot of flowers, and it may be that it's saying: I label this car as a modern car because it's got petals for wheels, whereas the classic cars tend to have round things for wheels. So it is doing this abstractly. It doesn't know about sports cars or what they look like, but it does know about curves. So for the SVM, you don't use TensorFlow anymore? No, no. Basically, I've used TensorFlow to create some features, and then I don't want to throw it away, because hopefully I've got a streaming process where more and more images are being shoved through this thing to give me my thousand features. So right now you're just using it for feature extraction and doing something completely different; couldn't you just put another layer on top with just two neurons? Right, right. There is an example called TensorFlow for Poets, I think, where they actually say, well, let's load up one of these networks and then do some fine-tuning. But there you get involved in tuning these neurons with some kind of gradient descent, taking small steps and all that kind of thing, and maybe you're actually having broad implications across the whole network, which could be good if you've got tons of data and tons of time. But this is a very simple way of just getting it to give you the features. No, but I want something very similar to what you just did, but still inside TensorFlow.
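The questioner's "two neurons on top" idea can be sketched without any framework at all: a single trainable linear layer over the frozen 1,000-dimensional features, trained by plain gradient descent on a cross-entropy loss. The data here is synthetic and every name is made up; it just shows that the tiny-head alternative to the SVM is only a few lines:

```python
import numpy as np

rng = np.random.default_rng(1)

# Frozen "Inception" features for 20 training images, two classes,
# separated by a small per-dimension offset (synthetic stand-in data)
X = np.vstack([rng.normal(0.0, 1.0, (10, 1000)),
               rng.normal(0.5, 1.0, (10, 1000))])
y = np.array([0] * 10 + [1] * 10)  # 0 = classic, 1 = modern

W = np.zeros((1000, 2))  # the "two neurons": one weight column per class
b = np.zeros(2)

for _ in range(200):  # batch gradient descent on softmax cross-entropy
    logits = X @ W + b
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    grad = p.copy()
    grad[np.arange(len(y)), y] -= 1.0  # d(loss)/d(logits)
    W -= 0.01 * X.T @ grad / len(y)
    b -= 0.01 * grad.mean(axis=0)

train_acc = ((X @ W + b).argmax(axis=1) == y).mean()
print(train_acc)
```

Only `W` and `b` are being tuned; the network producing the features stays untouched, which is exactly the trade-off discussed above against full fine-tuning.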
Sure, right. But it'd be a very small network, because an SVM is essentially fairly shallow. I mean, it often is. You actually had a question at the beginning: have you tried using the lower-layer features at all? No, so: TensorFlow, even though it has imported this large Inception network, as far as I'm concerned I'm using it as f of x equals y, and that's it. But you can actually inquire: what would it say at this particular level? And there are bunches of levels, with various kinds of constriction points along the way. I could take out other levels; I haven't tried it to have a look. There you get more like a picture's worth of features rather than this string of 1,000 numbers, so at each intermediate level it'd be more like pictures with CNN kinds of features. On the other hand, if you want to play around with this, there's this nice stuff called Deep Dream, where they try to match images to being interesting images, and there you do tend to featurize at lots of different levels. So at the highest level it's a cat, but I want all the local features to be as fishy as possible; then you get something like a fish-faced cat. That's the kind of thing you can do with these kinds of pretrained models. There's a lot of flexibility. Right, rather than jigger around with the AV too much, let me just give you the next meet-up things. Right, right.