All right, glad that you're all here. My talk is going to be about when to use machine learning.

First, something about myself: who am I? I studied methods and statistics, a research master, and during that time I got really interested in machine learning. So even though I was focusing on statistics in my studies, I was really interested in artificial intelligence. I'm also an Intel AI Innovator. What I love is open source, innovation, and particularly human-machine interaction: how can computers help empower people?

I want to thank Jibes for enabling me to speak here today. I'm a senior data scientist there. I've been working there for almost four years, at 15 different companies. Jibes itself has 35 data scientists. In my four years I've worked on blockchain, on social robots (of which you can see the picture), on natural language processing, for example chatbots or analyzing news sources like Reddit or Twitter, and on a lot of purely predictive machine learning projects. Usually I say something about the open source projects that I've done, but I want to keep them for the talk itself.

I don't know if anyone's familiar with the Gartner hype cycle. It's an interesting one to follow, because it shows where things are in the tech trends. This is where you can find me: I like to do these kinds of innovation topics, and I'm also on GitHub, so you can already have a look there.

So today I first want to try to explain what machine learning exactly is, not from a formulas perspective but just to give you an intuition. So, no formulas. Afterwards I will tell you when to use machine learning, because you shouldn't always use it. That's a spoiler. I'm sure that a lot of you already have ideas about what it might be, or have applied it: you know, cloned some GitHub code and then just run some example code, or maybe gone a bit further than that. But I hope something of
this will be inspiring to you.

So let's start with what it is. This is an example where we have the simplest possible data. It's kind of sociology data, which, ironically, was my major. Does anyone want to guess what the numbers would be there, what we as people would predict here? Yeah, I think it's pretty clear to see immediately; that's what we would think. And that's indeed what you would like a computer to predict as well, right? We have this intuition, but how can a computer learn that?

Basically, you want the computer to be able to generalize from the examples. So it would, for example, learn on the first three examples and be able to generalize to other examples that it hasn't seen before. In this example you would train on the first three and you would hope that it can predict the other two, right? So let's say this is your whole data set. You're going to split it up, and then you learn on one part and validate your model on the other part.

Next, I wanted to show you this same example in Python code. Let's see how the switching is going. I don't know who of you is familiar with scikit-learn, but it's a very good library which helps you develop machine learning models, and it popularized fit/transform and fit/predict. As long as you're still building a pipeline, you're doing fit and transform. For example, with the age and the income, it's kind of a function that you're trying to learn, right? You have an argument, someone's age, and then you try to predict the income. And scikit-learn is really good for this kind of thing, where you want to train on some data and then predict. So, I don't know if it's readable.
I guess I'll zoom in a bit. So let's first create some data. This is usually how you name the variables: X is what goes in, and y is what you want to predict. Sorry. All right, thanks. So, can you evaluate it? This is how the X variable looks, and the y variable. Looks correct, right?

So now we make a model, and the idea is that you fit it on the data. You take the first three examples, you learn on those, and then you can predict the rest. So what you can see here is that you're using the first three examples and then predicting the last one.

This is a way to learn a linear relationship. But the interesting thing is that in machine learning there are a lot of different ways you can have a model, and this is the simplest variation, with numerical data. Basically any model will have some kind of features that it learns from, and parameters, which are the thing that you're trying to learn. Here you can see that the value is a thousand, which means that if you multiply the age by a thousand you get the income. The interesting thing, of course, is what it would do for numbers that it hasn't seen. In this case, if you say 70, it will predict 70,000, obviously.

Okay, so this was a numerical example. Now an idea of decision trees, which you can always draw like: if this, then that, then that. There are two things here. This makes it possible to have non-linear transformations or predictions, and in the end you're predicting yes or no, which is not a numerical value. So that's also possible.

Another example of those is predicting spam. "Hi John, how are you?": we would say that's not spam. And "cling cling for free": that might be spam. Unless your name is John; then the first one is probably spam too.
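The age-and-income demo described above can be sketched without any dependencies. This is only an illustration under stated assumptions: the five data points are made up to match the "multiply the age by a thousand" result, and `SimpleLinearModel` is a hypothetical stand-in written for this sketch, not the code shown on the slide. scikit-learn's `LinearRegression` exposes the same `fit` and `predict` methods and a `coef_` attribute.

```python
class SimpleLinearModel:
    """One-feature least-squares line with a fit/predict interface (hypothetical helper)."""

    def fit(self, X, y):
        xs = [row[0] for row in X]
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(y) / n
        # Closed-form least squares for a single feature.
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (t - mean_y) for x, t in zip(xs, y))
        self.coef_ = sxy / sxx
        self.intercept_ = mean_y - self.coef_ * mean_x
        return self

    def predict(self, X):
        return [self.intercept_ + self.coef_ * row[0] for row in X]


# Assumed data: ages as input (X), incomes as the thing to predict (y).
X = [[20], [30], [40], [50], [60]]
y = [20000, 30000, 40000, 50000, 60000]

# Learn on the first three examples only, then predict the ones held back.
model = SimpleLinearModel().fit(X[:3], y[:3])
held_out = model.predict(X[3:])

# An age the model has never seen.
unseen = model.predict([[70]])[0]

print(model.coef_, held_out, unseen)
```

The learned coefficient is the parameter the talk refers to; the held-out predictions are what you would compare against the true incomes to validate the model.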
Yeah, so what could we use machine learning for? Rather than writing a lot of if-else statements, you can learn the logic based on existing input-output examples. The steps here are: find a problem; do some pre-processing (in our example we didn't have to do any); find a model that works; and then use this best model in production.

So now I want to bring your attention to the following: the significance of machine learning. I don't know if you know Venn diagrams, but this is how machine learning is. Or actually, I think it's more like this: machine learning is just a small part of this, right? You want to automate some process, and machine learning can be part of that.

So now it's time to go to the examples. This is a library that I put online, and I wanted to explain to you how I came to this idea. I had installed Arch Linux on my MacBook. I would not recommend that; it's a horrible idea. Don't try it at home. But programming at night, I noticed that there's a difference between the colors. You have those tools that know it's dark, so they apply some orangey filter. I never got used to that one. But what I noticed was that the browser, which is mostly white, was still really bright compared to when I looked at my editor, which was really black. So I thought it would be cool to take that into account. I also thought that, instead of having a lot of configuration, it would be cool if you had no configuration at all and were still able to do something about it.

So this is example data that brightml takes, and maybe it's fun to show it. I'm not really happy about switching, because it's a bit slow, but... So, yeah, that's a new version, right?
They always get these things. So whenever I switch between screens, you can see that it applies a new brightness; for example, the last one is 73. And going back... I don't know if it's visible on this screen. I guess not. Now it starts doing it. Okay, well, it's mostly designed for a laptop, so I'm only finding out here whether it works on an external monitor.

But the idea is that you need to collect features that can help you predict it, and these are the ones that I've got. You see here the new brightness: when I'm raising the brightness on my computer, it changes this value. It's actually a file on your computer, and whenever a change is made to it, that is recorded, along with these kinds of features: my battery power level, which application I'm in, but also that pixel value that I wanted. That's a value between 0 and 255, right? So the idea is that I want a model that could potentially learn the difference between high values and low values. And maybe the time is important, right? Maybe location. And the last one, ambient light: that's the sensor in your laptop. That's also a useful feature, of course, because it already tells you something about whether we're in the dark or not.

So I had a question: does anyone else have ideas about what you would want to use as input here? Well, I guess I've got quite a few here already. I actually found one more yesterday, when I was boarding the plane and couldn't charge my laptop anymore. I know it has a bad battery, because I'm running Arch Linux on a MacBook, which you shouldn't do, so I would only have something like 45 minutes of battery. Even though my battery was completely full, I still wouldn't want my brightness to be at full. But from the model's perspective, it's going to be like: yeah, you know, no problem with the battery.
That's just, you know, full brightness, because it's day. So it's a bit of a contradictory example, right? As a person you would be able to learn it, but it's not going to generalize, because it's going to be so rare that I'm ever in that situation again. And this is one of the main problems with machine learning: these kinds of rare situations.

The main takeaway here, which I actually didn't mention yet, is the cool thing about it: I don't have to do anything other than just change the brightness like normal, and over time I should notice that I need to change it less and less. That's really the cool part. You don't have to go through this whole process of collecting data and whatever. You want to change it, it will work for that time, and then you go to a different situation. So it's zero config, while it's still personalized, and I think that's really cool about it. You do still have to think about which features are available and which ones you want to use. But hopefully this would allow people to create their own brightness setting without too much effort.

That brings me to my next one, which is another library. It uses the Wi-Fi signal to detect where you are. Here we go, let's do it here: EuroPython, Fintry. Let's see if it works. So I have it here; see, it's in my bar. So my computer knows where it is. And I think it's cool, because this one is using a smaller module that I've made. Just to give an idea of how it works: it uses the scanner, and you have a lot of Wi-Fi inputs, like some IDs and how strong your signal is.
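As a rough sketch of the idea of learning a location from Wi-Fi signal strengths: each scan becomes a mapping from access-point ID to signal strength, and a new scan gets the label of the most similar recorded scan. The talk does not say which model the library uses, so the nearest-neighbour rule here is a swapped-in, simple choice; the access-point names, signal values, labels, and the `MISSING_DBM` fallback are all assumptions made for illustration.

```python
import math

# Assumed signal strength (in dBm) for an access point that a scan did not see.
MISSING_DBM = -100


def distance(scan_a, scan_b):
    """Euclidean distance between two scans, each a dict of access point -> dBm."""
    access_points = set(scan_a) | set(scan_b)
    return math.sqrt(
        sum(
            (scan_a.get(ap, MISSING_DBM) - scan_b.get(ap, MISSING_DBM)) ** 2
            for ap in access_points
        )
    )


def predict_location(labelled_scans, new_scan):
    """Return the label of the recorded scan closest to the new scan."""
    _, label = min(labelled_scans, key=lambda item: distance(item[0], new_scan))
    return label


# Hypothetical observations collected while sitting in known places.
labelled_scans = [
    ({"ap_livingroom": -40, "ap_upstairs": -75}, "couch"),
    ({"ap_livingroom": -45, "ap_upstairs": -70}, "couch"),
    ({"ap_livingroom": -80, "ap_upstairs": -42}, "bedroom"),
    ({"ap_upstairs": -38}, "bedroom"),
]

new_scan = {"ap_livingroom": -43, "ap_upstairs": -72}
print(predict_location(labelled_scans, new_scan))
```

The point of the design is that nobody writes a rule per signal strength; the labelled observations carry that information.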
So how this one works is: if you're sitting on your couch, or you're somewhere else in the house, the computer could know the difference, because the signal strengths are going to be different between the different access points that you have. And the interesting thing here again is that... Don't take pictures when there's an anti-slip. No problem.

So the cool thing here, when we look at this one, is that it's actually using it. I really like this idea of creating small models that you can then use in other things. I think that's where we have to go with machine learning: create small models that can then be reused in other components. Because I do think that where I am is actually going to be a factor in how you would want your brightness to be, or at least it can be predictive of it. So pluggability is key. And one of the ideas is that it's easier to learn from observations than to have to spell out something like "if my signal strength is this much, then...". Nobody wants to do that, right? That's really crazy.

Oops. Yeah, so I've told you how to solve some kinds of X to Y problems, right? Some input to some numerical value or some class. So you can only solve these kinds of X to Y problems. That's pretty limited, you would think, but people have been very creative in posing their problems as X to Y problems. For example, in computer vision.

And what would any respectable presentation be without a seemingly off-topic picture? This is an example from the ImageNet data set. I think they were predicting dogs, and they looked at which ones were wrongly predicted, and then they saw this. I think it's hilarious. But how would you use this in a model, right? How did I do it?
Well, it's different classes. In this case: is it a dog or is it not a dog? Zero or one, where one is a dog. On the other hand you have pixels, and this is the crazy part. For every image you have something like 80 pixels by 80 pixels by three channels (red, green, blue), and that is something like 19,000 data points, I think. You have to imagine that each of those is a value between 0 and 255. That means you immediately have big data once you have some sizable number of images. So that's an idea of computer vision.

I also want to talk about one case from work. I've worked for multiple insurance companies, and at one of them we wanted to investigate what computer vision could do for them. In this case they wanted to predict the amount of damage, how much it would cost to repair, from pictures of damaged cars. It took a really long time to get this data, because obviously they had not prepared for it to be used like this. So we started working with an academic car data set. This is an easy one, but they have examples where you have a tiny scratch, and it's very difficult to see that very small feature in the whole picture. So we actually made this problem a bit easier for ourselves to start with, and focused on the sides of the car: we were trying to predict which side of the car we're looking at. So it's kind of about localization. I did this over two years ago, but it's a good case because it's very time-consuming in a way.

But the problem is that they didn't have enough data. The cool thing, though, is that there's something called transfer learning, where you can use an existing model. At that time we used the Inception v2 model from Google.
They were training it for something like three months on something like 30,000 euros of machinery. The idea is that they made this whole network, and only this last red part, completely on the right, is where the actual prediction happens. In that case they used ImageNet, so it's like: this is a dog, this is a car, and so on, I think a thousand different classes. But the cool thing is that the part before that very last one can still be reused in other cases. In our case, features that are useful to tell the dogs apart might not be that useful, but there are always some features that it learned in order to represent this whole data set that you can use in another task.

It was a very interesting project. It was when TensorFlow was at 0.8, and we used their template to create an Android app. We changed it so it would accept our model, and it's still fun to walk around, point it at cars, and have a laugh every now and then.

What is typical in an insurance company is that they have strict rules already in place, so this was innovation. That's one place where transfer learning can certainly help. But image data is a very difficult one.
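One reason image data is hard is the sheer number of inputs. The pixel arithmetic from the ImageNet example (80 by 80 pixels, three colour channels) can be checked with a small sketch. The all-grey image and the figure of 1,000 images are assumptions for illustration only.

```python
WIDTH, HEIGHT, CHANNELS = 80, 80, 3


def flatten(image):
    """Turn image[row][col] = (r, g, b) into one flat list of values in 0..255."""
    return [value for row in image for pixel in row for value in pixel]


# Hypothetical image: every pixel is the same mid-grey.
image = [[(128, 128, 128) for _ in range(WIDTH)] for _ in range(HEIGHT)]

features = flatten(image)
n_features = len(features)  # one model input per pixel per channel

# Assumed data set size, to show how quickly the numbers grow.
n_images = 1000
total_values = n_images * n_features

print(n_features, total_values)
```

Each image becomes 19,200 input values, so even a modest collection of images runs into millions of numbers before any model is trained.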
And I think I missed this point earlier: we advised them not to continue with this one, because it was not their core business, this car thing. That was just one of their things; they're a very broad insurance company. We advised against going forward with this because it's just not their main thing, their core thing, and it would be very expensive to label all their data and so on. Another thing is that you're going to have a difficult time with compliance when you say something like "60% of the time it's accurate". Compliance doesn't really like that.

So I wanted to go to another complex problem. I don't want to go too much into depth on this one, but I thought it would be fun to have a neural network learn to complete neural network code. There are a lot of generative models out there. You just give it a bunch of text, and it will learn to predict the next word or character based on the things it has just seen before. So that's a generative model, and it can be very generic: it can be time series data, or text, or even images. But I don't want to go too much into depth about this one. I think that generating things in a company is usually not that interesting, unless you really make that your thing. For example, I think it's amazing what kind of art people make; Google made something called Deep Dream, or something like that. It's very interesting art. But these are just inspirational ideas. I wouldn't really recommend this as your next project, though it can be interesting, of course.

So, next one. I don't know who here has cryptocurrency at the moment? Okay, everyone sold it off already. So this is actually my only personal closed-source code. Actually, three years ago I was working on blockchain, and at that time
I was also a bit skeptical of it. I'm still quite skeptical of blockchain. I was always laughing when companies were saying something like, "Yeah, we're combining AI, IoT and blockchain, and this is going to be the thing." You know, now you have three problems instead of one.

In those days I also didn't have a lot of money, and I thought it was going to be way too expensive to trade these things, thinking about stocks, where it doesn't make any sense unless you have really big volumes. And there were already way too many people doing it. But a few months ago I thought: let's just collect data and analyze some of it.

Most of the models, the popular ones, apply the latest machine learning techniques, hoping that it's going to give them an edge. Basically what they're doing is take the price data over time and hope that they're able to predict whether it's going up or going down. That's what most people are doing. I thought about doing that for some time, but I thought: that's really risky. I don't have any say in it, I cannot control anything; you're going to wake up and anything can have happened.

So I thought: you know what, I'm going to analyze the data and see what the most obvious things are. I noticed at one moment that one of the coins was going to work together with Microsoft, and that really increased the price enormously. So I thought: what if I just monitor for such events, those things where it's going to be obvious to everyone that the price is going up? Maybe I can do something with that. So just make something very simple.
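To illustrate how simple such event monitoring can be without any machine learning, here is a minimal keyword-rule sketch. It is not the speaker's closed-source code: the keyword lists, the coin name `ExampleCoin`, and the headlines are all hypothetical, and a real monitor would still need a data source and careful checks.

```python
# Hypothetical phrases that signal a collaboration announcement.
EVENT_WORDS = ("partnership", "partners with", "teams up with")
# Hypothetical list of well-known partners worth reacting to.
WATCHED_PARTNERS = ("microsoft",)


def is_obvious_event(headline, coin):
    """True if the headline mentions the coin, an event phrase, and a watched partner."""
    text = headline.lower()
    return (
        coin.lower() in text
        and any(word in text for word in EVENT_WORDS)
        and any(partner in text for partner in WATCHED_PARTNERS)
    )


# Made-up headlines for illustration.
headlines = [
    "ExampleCoin announces partnership with Microsoft",
    "ExampleCoin price drifts sideways over the weekend",
    "Microsoft releases a new laptop",
]

alerts = [h for h in headlines if is_obvious_event(h, "ExampleCoin")]
print(alerts)
```

A rule like this is easy to explain and to monitor, which is the appeal of starting simple.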
There's no need to always try to build the most complicated machine learning model. And I can also assure you that it's a good experience to have your personal money on the line. It's a good experience in the sense that when you lose some money, you really go back and make sure that you're doing good monitoring.

Another big point here is that you don't need machine learning to create training and test sets or to run simulations, right? Machine learning is just one part. I'm just running simulations where the things I'm trying to find are, you know, just two values or something. It's not based on machine learning. You do backtests, but it's not necessarily machine learning.

Don't underestimate the work necessary next to machine learning. I didn't bring that up yet, but it takes so long, because you're depending on the exchanges to trade, and from idea to actually having something like that working takes a long time. You can also just do analysis sometimes, instead of forcing it to be machine learning. And as in Python, right, simple can be better than complex. That also holds for machine learning, or modeling of any kind.

Another one that I made is xtoy. I gave a talk about that some time earlier as well. It's the idea of automating these steps, because a lot of it is the same: you have some data, you do some kind of pre-processing on it, like dealing with missing values, because otherwise scikit-learn cannot deal with it, and you do model selection. In the end, after you've done a lot of different projects, you start to like a couple of models and you know when to apply them. If you throw all of this together, you get something like xtoy. So I'm loading that
library, of course, and reading in some data. I think it's this. So that looks like this. It's actually my favorite data set, to be honest. It's who survived the Titanic; they collected that. So you can say something about whether women indeed had a better chance, or whether the captain was the last one standing, these kinds of ideas. And yes, women did have a better chance, so that's good, right?

I'm thinking that we don't have much time anymore, but let's see how far we get. So I'm going to get the data again. "Survived" is the target: one is survived, zero is not. We have to remove it from the data, so now it's not there anymore. And then we can say: okay, same pattern as before, except now this is kind of messy data. There are missing values, there's text and whatever, and scikit-learn cannot deal with that. So the idea was: let's just do something that's very simple. It's similar to the example code, but what you can say is: let's fit on half of the data. For the people that don't know, that's how you can get half; it's a simplification. So now it's training, and then you can use it to predict. There we go. Then you can compare whether the predictions are actually equal to the true values. In this case we got 77% correct.

So the idea here is that you have image data, time series, text data, and there's so much pre-processing you have to redo if you start from scratch each time. It's nice to bundle things, and this is the kind of thing that I would also recommend to companies: make a platform that does this kind of pre-processing for your common things, right? If you're concerned with churn, then make sure that the main things that can be predictors of that are actually going to be there. Take care of pre-processing, cross-validation, and finding out what is going wrong with your data.
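The pattern in the demo (remove the target, fit on half of the data, fill in missing values, predict on the other half, compare with the truth) can be sketched with the standard library alone. This is not the library shown in the talk. The eight passenger rows are made up, `MajorityPerGroup` is a deliberately trivial hypothetical stand-in for a real model, and the resulting score says nothing about the real Titanic data.

```python
from collections import Counter, defaultdict


def column_mean(rows, column):
    """Mean of the known (non-missing) values in a numeric column."""
    known = [r[column] for r in rows if r[column] is not None]
    return sum(known) / len(known)


def fill_missing(rows, column, value):
    """Copy the rows, replacing missing entries in one column with a given value."""
    return [{**r, column: value if r[column] is None else r[column]} for r in rows]


def accuracy(predicted, actual):
    """Fraction of predictions that equal the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


class MajorityPerGroup:
    """Trivial stand-in model: predict the most common label per category value."""

    def __init__(self, column):
        self.column = column

    def fit(self, X, y):
        votes = defaultdict(Counter)
        for row, label in zip(X, y):
            votes[row[self.column]][label] += 1
        self.rule_ = {group: c.most_common(1)[0][0] for group, c in votes.items()}
        self.default_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.rule_.get(row[self.column], self.default_) for row in X]


# Made-up rows with text and missing values, for illustration only.
rows = [
    {"sex": "female", "age": 29, "survived": 1},
    {"sex": "male", "age": None, "survived": 0},
    {"sex": "female", "age": 41, "survived": 1},
    {"sex": "male", "age": 35, "survived": 0},
    {"sex": "male", "age": 22, "survived": 1},
    {"sex": "female", "age": None, "survived": 1},
    {"sex": "male", "age": 54, "survived": 0},
    {"sex": "female", "age": 19, "survived": 1},
]

# Separate the target from the inputs.
y = [r["survived"] for r in rows]
X = [{k: v for k, v in r.items() if k != "survived"} for r in rows]

# Fit on the first half, evaluate on the second half.
half = len(X) // 2
X_train, X_test = X[:half], X[half:]
y_train, y_test = y[:half], y[half:]

# Learn the fill-in value from the training half only, then apply it to both halves.
age_mean = column_mean(X_train, "age")
X_train = fill_missing(X_train, "age", age_mean)
X_test = fill_missing(X_test, "age", age_mean)

model = MajorityPerGroup("sex").fit(X_train, y_train)
predictions = model.predict(X_test)
score = accuracy(predictions, y_test)
print(score)
```

Bundling steps like these into reusable functions is the kind of platform work the talk recommends, so that each new data set starts from a working baseline.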
You can do all of that, and deal with your core domain features. Only the final step is actually the models. So just bundle it, and if you can make quick iterations, that's going to help you in a company. Then you can see: okay, we're missing data, or let's add this data, and it will be very quick. So that's important. But then, of course, there's always production. That's the next step, and that always takes more time: compliance, a proper development cycle. So make sure that you have that as well.

Okay, so I threw a lot your way. Let's wrap it up. In the end, machine learning is just a tool, but it can be really powerful in the right circumstances, like learning this kind of function between your input and something that you want to predict. But it's not more than that, right? Some people don't really understand that; they think everything is going to be automatic with machine learning. You have to do a lot of work around it.

Think, ask yourself: is it easy to create a feedback loop here? Or do you have to put in a lot of effort to create new annotated data where you get the answer? If you cannot collect income data, then it doesn't matter that you have age and that it's a good predictor. So this is a very important one. And don't forget to think for yourself about what could be useful features, right?
It's a bit of a simple one, but yeah. Also, I think pluggability is key. Don't try to solve everything in one model; make different models and spread out the problem. I actually think it's going to be very interesting to see, in the next few years, what models people come up with that you can then use in your own model. If you look at OpenCV and computer vision, they can do face detection and these kinds of things, but it took a long time for people to build this. Once we have these kinds of models for machine learning, it will be nice to chain them together.

And don't try to solve the most complex problems. If the data is way too complicated, or there are so many rules, then just start with something easier. Especially where many strict rules apply, like at insurance companies and banks: if you cannot explain it, if you cannot reason about why you're doing it, or if it's obviously wrong what you're doing, like discrimination or whatever, then it's not going to work.

Most people find optimizing models fun: getting the better score. But optimizing the model is usually not the best thing to do here. Even with the simplest model, you can still make really big improvements by getting better data, or by talking to the people who can help you get better data. So this is also a very important one.

And never underestimate the work required besides machine learning to actually get it into production. Even if you have the model and you're very happy with it, it takes time to get it to work inside a whole application environment. Build a framework in your company.

So I wanted to thank you. Come and say hi if something wasn't clear to you, or if you want to discuss your own examples, or just chitchat.
I'll be here until Saturday. And my final and most important suggestion is: make little projects, and then give a presentation about your machine learning projects at next year's EuroPython. Thank you.

Thanks very much. We have a few minutes for questions, if anyone has any.

Have you actually made money on your blockchain project?

You know, I have not lost anything yet. I'm still in the development phase, and projections are that even if Bitcoin is going down 20% per month, it should still be okay.

Sure. Nice. Thank you. Anybody else?

There are some cases where pluggability could hurt performance. For example, in the brightness prediction model you were showing, suppose there are two places that are close to each other, but you would want very different brightness values for them. Since the prediction is based on classes, you just predict the place, and if it predicts the place incorrectly, that could really hurt the performance of the brightness prediction. So what would you recommend doing in those cases?

Well, yeah, it's a good question. The thing there is that there are so many other features, so you just hope that the model will learn to prioritize the others. If you're afraid that this prediction is going to be messy, then the model will eventually learn that it's not an important feature, right? So then it wouldn't use it. I guess in the example of being at EuroPython versus somewhere completely different, it's going to learn that in that case it's a good feature. But maybe couch one versus couch two, if you want to learn that difference, very close to each other... Given how it's parameterized, I would expect that it's not going to put too much weight on this one. It would instead focus on time, or the pixel value, or something like that.
Yeah, thank you.

Thanks. Anyone else with a question? No? Okay. Well, the next talk is in here at 10 past 12. Can we say thank you again to our speaker?