Yeah, thanks for coming to my talk even though, for me at least, it's very early in the morning. Today I'd like to talk about what I'd call a very easy topic, actually a solved topic, but not for us, especially when you need to scale. I'm going to talk about image classification, or image tagging, and in particular how we made it easier for people to use. We built a library called Image ATM. We chose this name because it sounds cool, and you need a cool name for a good library, right? But ATM doesn't stand for the banking machine you know; it stands for Automated Tagging Machine.

Before we go into the topic, let me introduce myself. My name is Dat, and I'm currently part of Axel Springer Ideas Engineering, which is the innovation unit of the Axel Springer group. I'll also talk briefly about Axel Springer, because a lot of people are confused and believe it's Springer Nature, but it's a different company; Springer was a very popular name in Germany in the past. That's my team, that's me; we won a competition, an internal costume match. I'm also building up and heading Axel Springer AI, a new unit of Axel Springer. We're a new team, so you can expect more from us over the next few months and the coming year, because it's a new thing and we need a while to set it up. Before that I worked for one of the daughter companies of Axel Springer called idealo, a very big price-comparison website in Europe, actually the biggest in Germany, and that was my team; we mainly focused on computer vision. A long time ago I worked for Pivotal Labs, a software company; I was in the services arm, heading up data science for the German office, and we helped, for example, to digitalize Volkswagen, one of the biggest car makers in the world.

I also do a lot of fun projects on the side, although most of my code is not public; I work on things like hydroponic prediction, sorting images according to aesthetics, and other topics. I also like to do open source, so some of the code is on GitHub, and I like to write about this stuff; some of it is on Medium.

Okay, let's go to the agenda. First I'll give you a bit of motivation for why we needed to do this and why it was important for us. Then a short introduction to image classification; I think some of you are already aware of what image classification problems are. Then I'll talk a bit about the library, showing a little code as well, and I'll conclude with a bit of a roadmap: what our plans for the library are and how you can actually help to support it in the future.

So, some motivation. I want to start with the company itself, very briefly. This is Bild, a very popular newspaper in Germany; popular in the sense that it's very tabloid. It was founded in 1952 and it's basically our main flagship newspaper. Since then we've changed: we are Europe's leading digital publisher, with approximately 60,000 employees, more than 250 print and digital titles, and a presence in over 40 countries. Our main business is still journalism, but we also have a lot of other businesses, like classifieds, price comparison and so on. Here are some of the brands; some of you are probably aware of Business Insider, a company owned by Axel Springer that is very popular in the US, I think also worldwide, especially for its videos. But today I want to focus on idealo, a price-comparison website. This is what idealo looks like: basically a typical e-commerce or price-comparison website. What do you do? You come there,
you're looking for a product, you search for something, especially when you want to go shopping. The company itself is 18 years old and has about 800 people. What's the business model? Very simple: you have shops like Amazon or Otto or Zalando, they have a lot of shopping data, and they send it to our website. Overall we have 330 million offers, so a lot of offers. We take that data, crunch it, and at the end of it are the customers, and the customers just want a perfect shopping experience.

What a lot of people don't know is that this process involves a lot of manual work. The content team has about 180 people who do this number crunching: they type in the product, fill in additional information, but they also sort images, or create a product gallery. That's one of the use cases. When you type in a product, for example a helmet, you go to the helmet page and you see the name, the price, and then the listings. This is a typical shopping experience, but one important part of it is that you also want images: when you go shopping offline you can see the product, and online you still want to see what it looks like. The problem is that this process is very manual, and our goal was basically to automate it.

Let me show you a short video of how it looks for the content team who actually has to do this. We have this content-team tool: they type in the helmet, which is already automatically pulled in, and the tool pulls in all the images from different shops, in random order. For this product here they get some suggestions, and you get a lot of this stuff. The thing is, the content person now has to sort the images, and of course delete things that are not necessary or that we don't want to show on the website. If you do it for one product it's fine, maybe 40 or 50 seconds. But we have around 300 million products, we get over a million updates every day, and you'd have to do it every day. It's very mundane; you don't want to do that, it's not a fun thing. Optimally you would like this perfect product gallery: left profile, right profile, rear, front. So you want something automatic, but also something that sorts the images the way we want to display them to the user.

So what do we do? We use image tagging: we classify the images, and then we implement it in our system. What it does now is: you take the helmet, but now there's this button, you can't really see it, in German it's called something like "detect labels"; it was still in the beta phase at this time. And voilà: with machine learning, with deep learning, with our tool, you basically get the gallery in a perfect state without touching it. That's pretty cool.

Another use case: idealo also has a hotel price comparison, with around two million accommodations and around 300 million images. That's a lot of images: if you divide by the number of accommodations, you get about 133 images per accommodation. What's the value of a photo? Photos are very important. If you go traveling, maybe in Germany you look on Airbnb, or on TripAdvisor in the US, the first things you look at are of course the price and the photos, because you don't want to stay in a place that doesn't look good. When we looked at our website, we had a lot of problems. For example, this is a listing for Berlin, and if you go into it, it's not something that you really want: you don't want this image displayed first for a listing. When you go deeper, there are more images; usually it's around 100 images, and this here is only a subset of what we have. And the ordering is off: we know that the image at position 19 clearly looks better than position one. I'm not sure about you, but for me it's the same with the reception: even at position 17 the reception looks better, from the angle and everything, than at position three.

This is actually a two-fold problem, and we solved the first part as well: beautiful images should appear earlier in the gallery. I'm not going to talk about that here, but if you want to know more, we had the chance to write an article on the NVIDIA developer blog about image aesthetics, how to sort images, and we also open-sourced the code on GitHub. It's based on one of the Google papers, Neural Image Assessment (NIMA), and it uses an earth mover's distance loss to model the problem. We had amazing results; it was one of the very good projects that we basically finished. But our problem here is image tagging. Look at this gallery again: what do you see? You see a lot of beds, bedrooms in a way. What you want is to ensure that different areas of the hotel get depicted.
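As an aside, the earth mover's distance loss from the NIMA paper mentioned a moment ago can be sketched in a few lines. This is a minimal NumPy illustration of the idea, not the code we shipped: it compares the cumulative distributions of two score histograms, so predictions that put probability mass on nearby rating buckets are penalized less than distant mistakes.

```python
import numpy as np

def squared_emd_loss(p, q):
    """Squared earth mover's distance between two discrete
    distributions over ordered buckets (e.g. rating scores 1-10)."""
    cdf_p = np.cumsum(p)
    cdf_q = np.cumsum(q)
    return np.mean((cdf_p - cdf_q) ** 2)

# identical distributions cost nothing
uniform = np.full(10, 0.1)
print(squared_emd_loss(uniform, uniform))  # 0.0

# mass shifted far away costs more than mass shifted one bucket over
one_hot = np.eye(10)
near = squared_emd_loss(one_hot[0], one_hot[1])
far = squared_emd_loss(one_hot[0], one_hot[9])
```

This ordering-awareness is exactly why EMD fits aesthetic scores better than plain cross-entropy, which treats all wrong buckets as equally wrong.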
That's where image tagging comes into play. You want bedrooms, bathrooms, restaurant, facade, studio, kitchen; this is what you would like to have, and there are more categories: reception and so on, maybe pool as well, there are so many at the end of the day. And over time we also got a lot of ad hoc requests from different teams, like: hey, we have this special winter thing. In Europe we like to go skiing, and during a certain period everyone looks on the internet for skiing places, and it's very convenient if you're a price-comparison website and can say: I want to show winter images on a listing. We had a lot of these requests, as you can see.

So the problem, from the first standpoint: the helmet is only one category, but overall we have 2,000 product categories, a lot of categories. Then there are many classes within the categories, there are a lot of hotel-image ad hoc requests, and this is only one of our companies. Thinking about the group itself, we have over 180 companies, which means a lot of image-tagging problems, and not only at idealo. Think of SeLoger, a very big French company for apartment search, or Immowelt, a German one; they also have images of apartments, so you can reuse this for other use cases too.

Now, image classification is a solved problem: you go on the internet, take a cats-vs-dogs tutorial, take a Keras example in a Jupyter notebook, and run it through. But we needed a tool that is really easy to run, for fast iteration. We also needed something for non-machine-learning people, non-data scientists, like software engineers, because in any company you just don't have 10 or 12 data scientists who understand computer vision and image classification. Of course it also needs good documentation, because software engineers love documentation, that's how they use it. And one more thing is explainable AI; I'll give the motivation for why explainable AI is important later, so just hold on for a moment.

Now a short introduction to image classification, so that everyone is on board. As I said, image classification is a very simple task: you assign an input image a label from a fixed set of categories. If you want to know more, here's a link to one of the articles by Andrej Karpathy, which is very well done. Basically it's a supervised learning problem, not very difficult, and there are a lot of examples you can find on the internet. A typical example is digit recognition from zero to nine; there's Fashion-MNIST, another MNIST version but with fashion items, from Zalando, put out a couple of years ago by one of my good friends, Han Xiao; there's ImageNet, CIFAR-10, and of course cats vs. dogs. So there are many of these tagging examples.

How can you solve it? There are many solutions out there: you can even use support vector machines, you can use feedforward neural networks, convolutional networks, CapsNets. We mainly focus on CNNs, because we're in industry, which means we need a trade-off between a fast model and good accuracy. It's not something where we can say, let's take a state-of-the-art CapsNet or any other model, because CapsNets are too slow for many of the use cases.

Another important, state-of-the-art concept is transfer learning. In traditional machine learning, by comparison, you have two tasks and you train two separate learning systems. In transfer learning you train a model on some task, and from that learning system you transfer the knowledge to another learning system, to a downstream task. For images it's pretty simple: you use a convolutional network pre-trained on millions of images, for example on ImageNet with VGG16 or whatever, and then you replace the top layers. It was trained on ImageNet with 1,000 classes, so you replace the last layer with your own output and train on top of the existing layers. There are a lot of concepts here, and a lot of things that can go wrong; for example, a lot of people start by training all the layers at once and just hope the output works out. And there are many libraries: Keras, TensorFlow, fastai, which makes these things very easy, PyTorch, MXNet, and many more.

So why did we build something ourselves? Let's go through the example of tf.keras, since that's the newest incarnation of TensorFlow, and we are heavy TensorFlow users. We like it, there's no particular reason; we also look into PyTorch for NLP, but so far we've had good experiences with TensorFlow, also because it brings a lot of things for production-ready code.

What do we do? We import some code, and in tf.keras you can use the ImageDataGenerator to load your images, do some preprocessing, set the target size, and do your train/validation split. This is good because the generator loads images lazily, so it doesn't load everything into memory at the beginning. Then you can use the functional API from Keras to define your model: in this case we use transfer learning with ImageNet weights, we put in the input shape of 224 by 224, which is normal, and then we add a flatten layer and a dense layer, nothing special. Afterwards you compile the model and train it: you define your loss, your epochs, and so on.

From this one example you can already see the problem: there are many ways to define things in Keras. You can use the sequential API, the functional API, the new subclassing API. For someone who's new to machine learning that can be overwhelming, especially for software engineers: what is this? I don't get any of it. Then you have the train/validation/test split, done by the ImageDataGenerator, or you can use scikit-learn for that, so there's also variability in how you solve the same problem. Then you have the choice of models: should I use MobileNet, ResNet, VGG16, VGG19, ResNet-152? There are so many out there, so it's not that easy. Then your optimizer, same thing: should I use Adam or SGD? Your loss function as well, metrics, number of epochs: 10, 15, 20, I don't know. It's a lot for a software engineer who comes in and sees all of that.
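To make that concrete, here is roughly the kind of tf.keras script the slide shows. This is a hedged sketch, not Image ATM's actual code: the directory path, class count and hyperparameters are placeholders, and I pass `weights=None` to MobileNetV2 only so the example runs without downloading ImageNet weights (for real transfer learning you would pass `weights="imagenet"`).

```python
import tensorflow as tf

# pre-trained base; weights=None here only so the sketch runs offline --
# use weights="imagenet" for actual transfer learning
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights=None
)
base.trainable = False  # freeze the pre-trained layers

# replace the 1000-class ImageNet head with our own 2-class output
x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
out = tf.keras.layers.Dense(2, activation="softmax")(x)
model = tf.keras.Model(inputs=base.input, outputs=out)

model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-3),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)

# lazy image loading with a built-in train/validation split
gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1.0 / 255, validation_split=0.2
)
# hypothetical data folder -- uncomment with your own images:
# train_flow = gen.flow_from_directory("data/train", target_size=(224, 224),
#                                      subset="training")
# model.fit(train_flow, epochs=10)
```

Every line above embodies a choice (base model, head layers, optimizer, learning rate, split strategy), which is exactly the decision overload the talk is describing.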
He's like: wow, what is this? It's hectic, it's not easy in that sense. And then of course you can train anywhere: maybe you have your own cluster, there's Microsoft Azure, Google Cloud, AWS, so you can see there are many, many options. And this is only for tf.keras; when you look at other libraries it's the same feeling, it's overwhelming for people.

Another motivation is explainable AI. In the past, say five years ago, you trained a neural network and people said: okay, this is a black box, I don't understand it. But over recent years there's been a shift towards needing to understand our models, and there's been a lot of research going on in this area. Why is this important? A very important example I took here is about ethics and biases. This is a tweet from a while ago, when Google automatically tagged Google Photos: in this example, I'm sure you can see it's a black woman and a black man, and the photo is labeled as "gorillas". We know what gorillas look like; these are human beings. The problem is that when Google trained the model, they didn't have enough data of black or Asian people; a lot of the data was biased towards white people. So you need to understand what your model is actually picking up at the end of the day.

Then of course GDPR; we love GDPR, just joking. But under GDPR there's a rule: someone in Europe can demand from a company, especially when you have machine learning in production, why a decision was made, what the reason was. You have to give them the reason, and if you don't, they can fine you a lot of money. That's also very important for you as a human being: think of when you apply for a contract, like a mobile contract. In Germany, mobile providers can reject you because of a model or something like that, and you need to understand why.

This is an example I took from Dr. Wojciech Samek, who works for the Fraunhofer Institute in Berlin and recently gave this talk in front of my team, about interpretable and trustworthy machine learning. It's an example from the Pascal VOC challenge, similar to ImageNet, and he asks: why is this a boat, why is this a train, why is this a horse? Maybe some of you can tell me why this is a boat. "There's water in it." But why don't you say it's a boat because there's a boat in it? Why is it a train? "Tracks." Okay, that's funny. And why is it a horse? That's actually very interesting, because when we train a model, the model should determine that this is a boat; it should not say it's water. Same thing with the train: the model should say it's a train, not that there are rails. Same with the horse: the horse should be recognized, not the human on the horse or something else.

So what did the models learn? Over the years of the challenge the models performed pretty well, but when people applied interpretable machine learning on top, they realized the machines were learning really weird patterns. If you apply a bit of Grad-CAM here, the model was learning that this is a boat not because there's a boat, but because it's on water: they had a lot of pictures of boats on water. Similar for the train: it was on rails. That's why it's funny that you said "tracks", because actually the
model should learn the train, not the rails: a rail is not a train. Similar with the horse: it didn't learn the horse. I'm not sure you can see it, but it learned the caption here. There were a lot of pictures with this caption, so every time the caption was there, it was a horse. That's important to understand, because a lot of this technology goes into critical systems, like self-driving cars in the future. If the model learns that a caption means horse, I can just put the caption somewhere and fake it, and the car might just run over the actual horse. So it's pretty interesting to look at what the network is actually learning.

Of course, CNNs these days can be explained. If you visited Tapania's talk yesterday, you saw different techniques; we also did a lot of work here, especially on attribution techniques and visualization techniques for vision. An attribution technique like Grad-CAM basically produces a heat map: it shows where the highest activations in the feature maps are. There are further methods, like saliency maps and LRP, layer-wise relevance propagation. For me, Grad-CAM is the best one at the moment: there's a recent paper released by Google where they tested a lot of these methods, and the most stable one was Grad-CAM. But the area itself is evolving quite fast. The other family is visualization techniques, or feature visualization: you take a neuron, which is differentiable with respect to the input image, start from random noise with some jitter, and optimize the input image to maximize the activation of that neuron.
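The Grad-CAM heat map described above is, at its core, just a gradient-weighted sum of the convolutional feature maps. Here's a minimal NumPy sketch of that combination step, assuming you already have the last conv layer's feature maps and the gradients of the class score with respect to them (in practice a framework computes those for you; array shapes here are made up):

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Combine conv feature maps into a Grad-CAM heat map.

    feature_maps: (H, W, K) activations of the last conv layer
    gradients:    (H, W, K) d(class score)/d(feature_maps)
    """
    # one importance weight per channel: global-average-pool the gradients
    weights = gradients.mean(axis=(0, 1))  # shape (K,)
    # weighted sum over channels, then ReLU keeps only positive evidence
    cam = np.maximum((feature_maps * weights).sum(axis=-1), 0.0)
    # normalize to [0, 1] so it can be overlaid on the input image
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam

rng = np.random.default_rng(0)
heatmap = grad_cam(rng.random((7, 7, 64)), rng.random((7, 7, 64)))
```

The resulting low-resolution map is then upsampled to the input size and overlaid on the image, which is how the boat/water and horse/caption findings were made visible.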
Then you can see shapes emerging: at the beginning maybe very low-level filters, and at the end more of the shapes that define the object.

Okay, let's go to Image ATM, the library itself. Let's revisit the problem; it's very simple. In a tagging problem you have to label the images, and that's your input data. From the input data you do some preprocessing, for example image augmentation, then you fit a model and get the output, and of course, since we're in industry, we also want to deploy it; it's not a research project, we want to use it in our product. This whole thing was super manual when we started: the first setup took us probably a day or two, and afterwards each new problem took one or two hours. That's fine if you have time, but we're software engineers at heart, we don't want to do that; it's very mundane work. It's also a solved problem: once I've solved one image-tagging problem, I don't want to solve another image-tagging problem, because it's boring; I want to solve problems that are interesting.

The initial idea: we can't tackle everything at once, so we go step by step, and we decided to tackle input, processing, modeling and output first. The labeling part is also very difficult, because for labeling there are many approaches as well: whether you build internal tools to label your data with your own people, or whether you use Mechanical Turk solutions, which is also not trivial; I know what I'm talking about, it's not that easy. And deployment is another very, very difficult task. When you think about deployment, you need to think about where you want to deploy, how you scale, whether you use Kubernetes for this, how you do monitoring, security, many things. A lot of people think going from research to production is difficult, but they haven't seen what comes after production, because that part is also very difficult.

And what does the usual flow look like? You have some images, you put them in some folders. Even this step can be tricky: when people put images into folders, how do they do it? You have folders, subfolders, sometimes people put all the images in one folder, so we needed some abstraction there too. Then you have augmentation: you augment your dataset, you rotate, you resize. And since machine learning is pattern recognition, your class distribution needs to be in good shape; an imbalanced dataset will be tricky, so you need to take care of that too. And since it's machine learning, not some fixed closed-form model, we also need to understand how well the model generalizes, so we need a train/validation/test split. Afterwards you might use AutoKeras, some kind of AutoML, because, as a very nice quote from Leonard yesterday put it, if you want to do hyperparameter optimization, hopefully you have a student, a grad student or someone else, because I actually don't want to do it myself. It's very mundane: you sit there changing some numbers, wait 20 minutes to see a change; it's nothing you'd like to do if you want to work on really interesting problems.
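Since class balance and the train/validation/test split both come up above, here's a small pure-Python sketch of a stratified split, i.e. one that preserves each label's share in every subset. It's an illustration of the idea, not Image ATM's implementation (the library wraps this kind of logic for you); the 80/10/10 ratios are just example defaults.

```python
import random
from collections import defaultdict

def stratified_split(samples, train=0.8, val=0.1, seed=42):
    """samples: list of (image_id, label) pairs.
    Returns (train, val, test) lists, keeping each label's
    proportion roughly equal across the three subsets."""
    by_label = defaultdict(list)
    for item in samples:
        by_label[item[1]].append(item)

    rng = random.Random(seed)
    train_set, val_set, test_set = [], [], []
    for items in by_label.values():
        rng.shuffle(items)  # shuffle within each class
        n_train = int(len(items) * train)
        n_val = int(len(items) * val)
        train_set += items[:n_train]
        val_set += items[n_train:n_train + n_val]
        test_set += items[n_train + n_val:]
    return train_set, val_set, test_set

# imbalanced toy dataset: 80 cats, 20 dogs
data = [(f"img_{i}.jpg", "cat" if i < 80 else "dog") for i in range(100)]
tr, va, te = stratified_split(data)
```

Without stratification, a rare class can end up entirely in the training set, and your validation metrics would never notice it.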
Then of course afterwards you get a model and you want to interpret it, because understanding it is very important for us as well.

So what we built covers this pipeline, and it's called Image ATM. It's still new, so we don't cover everything yet, but I'll go through what it can already do today, and maybe where you can contribute in the future. Image ATM is supposed to be a one-click solution for everyone to do image classification, in a very opinionated way: we really define how the folders should look, how the data should look, how we validate the images, whether they are real images or not. Another thing that was really important for us: we train a lot in the cloud, and training in the cloud is not that easy if you think about it. Say you train on AWS: are you using boto3 as a library to orchestrate it, or Terraform, or other tools? And then there's training and model evaluation as well. It's available on GitHub, so you can check it out; it's compatible at the moment with Python 3.6 and it's still using TensorFlow 1.x. We hadn't moved to TensorFlow 2.0 at the time because it was still in alpha and there were a lot of bugs, but migrating to TensorFlow 2.0 is on the roadmap. How do you install it? Very easy, like most Python packages: pip install. We made it as easy as possible, and you can also just take the bleeding edge from the source code. In terms of usage we defined two options: either you train via the command line, or without the command line, in a Jupyter notebook or Google Colab.

The Colab and Jupyter option was very important for us, because sometimes we have special problems we still want to work on interactively, and it keeps things easy for us as well. Let me show you a little demo of how it works. This is the documentation: you install it, and we have an opinionated way for you to define your folder structure; we had a lot of discussion about this, but basically it's very easy. You need all the images in one folder, you need a JSON file with your classes and your image IDs, and a config file; that's all. In the data JSON you need the image ID, which points to the file within your image directory, and the label. We chose this format because we also want to version-control the data, which is very important at the end of the day. When you're experimenting, some people just don't version their data: they put the data in, the next day new data comes in, and suddenly it's a different model. This way you can still version-control the data JSON, while the data itself is stored somewhere on S3, so you don't really need to version the images themselves. Here's an example where I downloaded the cats-and-dogs data and wrote a little helper function to convert my files into this format. You can see my folder structure here, and in the config file I just need to define the image directory, where the data is sitting; my data is currently in the cats-and-dogs train data folder. You also need an output directory, and in the output directory you get the model, the model outputs and the evaluation.
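To make the expected layout concrete, here is a hedged sketch of a helper like the one mentioned in the demo. The schema shown, a flat list of objects with `image_id` and `label` keys, matches the Image ATM data file as I remember it, but treat the exact key names, the function name and the folder layout as assumptions and check the project's documentation:

```python
import json
import os

def build_data_json(image_dir, out_path):
    """Scan a folder laid out as <image_dir>/<label>/<file> and emit a
    flat list of {"image_id": ..., "label": ...} records as JSON.
    NOTE: key names are an assumption -- verify against the Image ATM docs."""
    records = []
    for label in sorted(os.listdir(image_dir)):
        label_dir = os.path.join(image_dir, label)
        if not os.path.isdir(label_dir):
            continue
        for fname in sorted(os.listdir(label_dir)):
            records.append({"image_id": fname, "label": label})
    with open(out_path, "w") as f:
        json.dump(records, f, indent=2)
    return records
```

Image ATM itself expects all images in one flat folder; this helper only shows how you might generate the JSON from a typical labeled-subfolder dump before flattening, and checking the JSON into git is what gives you the data versioning described above.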
the variation and then we have basically three classes very simple in this case we have data preparation right so and then we have a option why to run this because sometimes you just don't want to run a double preparation because your data is already prepared right and then you know you have the your sample file your sample file is the data where it relates to right and then at the moment we have only the resize option which means you can only resize images because at the end of the image classification you just don't take the full image right because the full image will not fit into your memory and GPU right so you want to reduce your file and then of course you can train and evaluate at the end of the day so everything is true right now right and when you just go into the data JSON file so nothing spectacular just image ID and the label right and what you can do now is we have image attempt now also directed in in in our CLI right we call the pipeline we can also call the epochs dense right and then we can in this case I only like one epoch dance and one train all right and then I just put in the the config file and basically it loads in the images now let's wait a little bit I get a lot of this warnings because you know 10 keras is changing so we also need to change as well but so basically you can see here there's some image validation going on the cool thing on there is and we also show the split and all stuff so this is very important and we are using multi-parallel processing right so you will you will use all the cores on your machine so it's not like single processing and then it's training right of course I'm I don't have a GPU on my machine here so it's really slow it's not made to train on this one right it's just to showcase you a little bit let's just wait a little bit until this finished training because what I want to show you like after the CLI is you can also just use like a Jupyter notebook right and we made it very easy to train it on Google 
Basically you can do the same thing there: you go into Google Colab, you just do a pip install of Image ATM, it installs all the requirements, and then I download the cat and dog images again. Let's go down — here I just use the small helper function to transfer the data into the JSON file, and then, as I said, it's the same thing: you have the three components. You take the data prep component with your image directory, your samples file and your job directory, you call run with resize, and it resizes the images; in this case you can see the class distribution and the split distribution we chose. Then we go to training, where we can also define the options, like the dense-layer epochs and the train-all epochs, and then it's just training — you still see the typical output of a Keras model.

What we do is a special thing. A lot of people just train the dense layer. We instead do a two-phase training: we start by training only the dense layer for as long as possible, and then we start to train all layers, but with a smaller learning rate, because you don't want to destroy the weights that were already trained — it's transfer learning, after all. Then we run the evaluation, and this is what you get: the distribution in the test set, the predictions, and then your metrics — precision, recall, F1 score — and the confusion matrix, plotted as well.

And the cool thing is we also use Grad-CAM in there as a method, and we visualize not only what is correct but also what is incorrect. That means you can see: okay, this was truly a dog, but it predicted cat — and you can also see what the reason was, why it was not classified as a cat or why it was not classified as a dog.
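To make those evaluation numbers concrete, here is a small self-contained sketch — plain Python, independent of the library — of how per-class precision, recall and F1 fall out of a confusion matrix like the one the evaluation step plots:

```python
def per_class_metrics(confusion):
    """Per-class precision, recall and F1 from a confusion matrix.

    confusion[i][j] = number of samples whose true class is i and whose
    predicted class is j (rows = ground truth, columns = prediction)."""
    n = len(confusion)
    metrics = []
    for c in range(n):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, truly other
        fn = sum(confusion[c]) - tp                       # truly c, predicted other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return metrics

# Hypothetical cats-vs-dogs run: 8 cats correct, 2 cats misread as dog,
# 1 dog misread as cat, 9 dogs correct.
cats_dogs = per_class_metrics([[8, 2], [1, 9]])
```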
The same thing here — but the model has not really been trained for long, so don't interpret too much; it's just for testing. If you want a good model, you really need to train it for much longer. And you can also look at, for example, correctly classified cases: for the dog, for instance, it found the ear. Same thing here. So this is pretty cool, and this is pretty fast now. Okay, still not finished — let's just cancel this one, because what I also want to show you is that when you run image-atm with help, it basically shows you all the things it can do. You can go further, for example into the pipeline command, and it also shows you what you can change. So it's pretty easy for a software engineer; we had to document it as well as possible, because we're not there all the time.

And in this example here you can also see an example of cloud training. We are using Terraform in the back end, because it's a good tool to orchestrate your cloud training — it also keeps a state, and so on. And we can use IAM roles to define access: if you are using it privately you get full access, but in a company context you are a user with restricted rights, so we also define that for you. Then you basically just set your AWS access key and secret key, and there's another class, Cloud, where you set your provider, your region and your bucket — and it gets destroyed afterwards. So this is pretty simple for us, and we use it a lot these days: our software engineers just use it, they just run it. And it's elastic, which means you don't spend money on resources you don't need. That's pretty good.
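For the cloud part, the settings described here might look roughly like the following; again, the key names and values are illustrative assumptions reconstructed from the talk, not the library's exact schema, and the AWS access key and secret key would come from your environment rather than the file:

```yaml
cloud:
  provider: aws
  region: eu-west-1            # example region
  instance_type: p2.xlarge     # example GPU instance for training
  bucket: my-imageatm-bucket   # S3 bucket holding images and job outputs
  destroy: true                # tear the Terraform-managed resources down afterwards
```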
Okay, I'm actually almost at the end, so, the summary. Image ATM actually helped us to solve a real problem: it took our training workflow from a couple of hours down to minutes. It's very short and very simple now for our users, our software engineers. It also enforces that people load and store the data in the correct way, in the format we want, because in the future we might do another training on it. One thing you have to keep in mind when you do image classification, or any other machine learning, is that you have to retrain: machine learning is not a one-time training, it has to be retrained all the time. And if the data is already in the correct format, you can reuse it across a lot of different problems. Also think about it this way: if idealo creates these files in the correct format and I go to another unit and they create theirs in the same format, it becomes very easy for me.

This library is now accepted and used at idealo; they used it for other classes like washing machines, cameras and smartphones, and there are a lot of use cases. For example, this is a sneaker gallery — it's not tagged by hand anymore. Sneakers are very important; I don't know why, but Germans like buying sneakers — that's why Zalando became so popular — and now it tags all the sneakers automatically, and we have a lot of sneakers, so it's a really cool thing.

This is a snapshot of our roadmap, what we want to do. Of course we want to upgrade to TensorFlow 2.0 and take advantage of all the advantages it has, like GradientTape and the subclassing APIs. We want to add AutoML capabilities — at the moment we are just using our own experience to set up the training. The reason we didn't add AutoML yet is that if you train it on the MNIST dataset, for example, it can take up to eight hours at the moment. I mean, it sounds cool that you don't have to do anything yourself, but spending eight
hours on AWS on a GPU is just a lot of money. When you consider that you can just take a transfer-learning model, train it for ten minutes and get the same accuracy, you have to ask whether it's worth eight hours. We want to add more ensembling techniques, but we still have to think about what works well there. We want a PDF report output — at the moment you can only see the results in a Jupyter notebook — and more image optimization techniques as well. Semi-supervised learning is something pretty cool that just came out this year; there are a couple of papers, also from Google Brain, where you take advantage of training on a lot of unlabeled data first — that's the unsupervised part — and then use it to solve your problem in a supervised manner. This way you don't need a lot of labeled data, which matters because most image classification problems need a lot of data — image classification is a very data-hungry problem, and a lot of people are working in this area to solve that.

Image deduplication is something we are also working on; it's coming soon, and it's not integrated in this package — it will be a separate package. It is very important: if you just think of CIFAR-10, there were a lot of duplicates in there. Duplicate images cause a lot of problems, because when your train and test datasets share duplicates, it will bias your results. That's a big problem, and a lot of people just don't think about it. And of course, part of the reason we are building this is that there is a lot of stuff out there already, but it's not good.

All right, this is the team that built it. I was not alone — I'm never alone on that — and I think the credit goes more to the team than to me, because I was mostly doing management work and giving direction; the real work was done by the team itself.
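On the deduplication point above: the train/test leakage described there can already be caught for exact duplicates with plain content hashing. This is a standalone sketch of the idea — not the upcoming package, which would also need perceptual hashing to catch near-duplicates such as resized or re-encoded copies:

```python
import hashlib

def cross_split_duplicates(train_images, test_images):
    """Find exact duplicates shared between the train and test splits,
    since such leakage biases evaluation results.

    Both arguments map an image id to the raw image bytes; returns
    (train_id, test_id) pairs whose contents hash to the same value."""
    train_hashes = {}
    for image_id, data in train_images.items():
        digest = hashlib.sha256(data).hexdigest()
        train_hashes.setdefault(digest, []).append(image_id)
    pairs = []
    for image_id, data in test_images.items():
        digest = hashlib.sha256(data).hexdigest()
        for train_id in train_hashes.get(digest, []):
            pairs.append((train_id, image_id))
    return pairs
```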
Of course we are looking for contributors. Contributing also means using it in your company and filing bugs — file an issue on GitHub. It doesn't necessarily mean contributing code, although that would be good too. Yeah, thanks for your attention; that was my talk.

Thank you, that was fantastic. My question is more around this: now you have eliminated the manual workflow for image tagging, but what is the quality of the images you are getting? Yesterday we had a talk from Zoomcar where they said that if images are taken in dim light or from different angles, it causes problems. So now that you have so many products, what is the process to ensure that the quality of the images you are getting — and then the tagging — is actually good, and what part of that is manual?

Yeah, so at the moment the initial tagging is very manual, because the content workers are still doing that, or a part of my team, at the end of the day. And in order to ensure quality you can, for example, run different experiments. For some categories it's very hard for people to understand the difference between two classes, so you can use multiple votes or something like that. In terms of your other question, about the quality of the angles and so on: product images are usually really good. It's not something that people take with their own camera — the images come from the shops, the data is just unlabeled. We really don't have any problems with quality, because all the shops have good images; they need them for their own shops as well.

Hi, is there a good application for this kind of process in photojournalism? Well, yes, there is — for example, for
one of our companies, called ice, that does syndication. They pay a lot of money just to label images, and they have been paying for labeling these images for a couple of years, because at the end of the day these images also get sold back to other media companies — in journalism, for example, or to the DPA. It's very time-consuming, and a lot of it you can automate: you can label it already.

Well, at the moment we're not using AutoKeras because of this eight-hours problem, but in the future I would say the best solution would be to take an initial architecture and then put AutoKeras on top of that. I mean, there are different solutions — that's why you have NAS, neural architecture search systems — where you basically train a model, take the best model so far, and iterate: you start from the first layers and then add to them. But with the naive approach, where you start from a random layer and add more and more layers and stack them, it can take a long time to find the right model. For this kind of problem, I think you should take an initial model and then build on top of that.

I had a question, sorry: when we use Image ATM, can we actually configure all the layers in the neural network, or only the dense layer? Because you showed only the dense layer — can we configure dropout layers and other layers? Actually, no, because we enforce that. If you really want custom layers and want to do something else, I think you should use Keras, TensorFlow or PyTorch itself. For very special problems you wouldn't use this; this library is only necessary when you need to scale, when you need to classify many, many different categories. It doesn't make sense
if you just train once with a custom layer. Thank you.

Okay, over here first. Okay, so the gentleman asked how it performs against state-of-the-art models. I mean, that depends on what you mean by state-of-the-art: state-of-the-art sometimes just gets two percent more accuracy, or less. Of course it's not beating that, but that's not the problem for us — if we have 90% or 95%, it's enough; it's not like we need 99%. At the moment we give a recommendation to our engineers: okay, take 20 epochs. And there's logic behind that as well; it's not something completely random at the end of the day.

Yeah, we are running short of time, sorry for that. Is he available afterwards? Yes, I'll be outside. And if you want the slides, I already put them on Speaker Deck, and I think the slides will also be published later anyway. Yeah, thanks for having me. Thank you.