Good evening, everybody. I'm Anurag Mishra, a software engineer at Index. My talk is about how to generate captions from an image. Suppose you've just captured an image on your phone and you want to describe it. That's very easy for us humans: if I took a picture of this room, I'd say a bunch of people are sitting in a room. But for a machine, for a computer, it's very difficult. One of my projects was based on this topic. It's a complex problem to solve, so I used a deep learning approach to generate captions from images. In a nutshell, it's a three-step process. First, you extract features from the image. Then you pass those features into a kind of black box, which is actually neural networks of different kinds, CNNs and RNNs. Finally, the output of those networks is what you evaluate, to check whether the generated captions match the human captions or not. Initially, you clean the image; data cleaning is the first step of any data science or machine learning problem that goes into production. You resize the image to a fixed size, which in my case was 256×256×3 pixels for the RGB values. After that, I passed it into pre-trained CNNs (convolutional neural networks): I used VGG16 along with Google's Inception net, and I obtained the features from the CNN. But a convolutional neural network like that is actually trained to classify images into objects, so what I did was remove the last layer and take the features from the penultimate layer, the second-to-last layer of the network.
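The feature-extraction step above can be sketched in Keras as follows. This is an illustrative sketch, not the speaker's actual code: note that stock VGG16 expects 224×224 inputs, the input here is a random stand-in for a real photo, and `weights=None` is used only to keep the example self-contained (in practice you would load the ImageNet weights so the features mean something).

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.models import Model

# Build VGG16 and drop its final classification layer; weights=None avoids
# the ImageNet download here, but use weights="imagenet" in practice.
base = VGG16(weights=None)
extractor = Model(inputs=base.input, outputs=base.layers[-2].output)

# A random stand-in for a preprocessed photo (VGG16 expects 224x224 RGB).
x = preprocess_input(np.random.rand(1, 224, 224, 3).astype("float32") * 255)
features = extractor.predict(x)
print(features.shape)  # (1, 4096) -- the penultimate-layer feature vector
```

The 4096-dimensional vector from the penultimate fully connected layer is what gets fed onward to the caption-generating RNN.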
For every image in my training data set, I had a set of five captions. There are very popular data sets for this, MS COCO, Flickr8k, and Flickr30k; you can just search for them on the internet. So for every image I had five captions, and I passed the image features, along with each word of the caption, into an RNN, training it to predict the next word. I had to fine-tune my RNN, my CNN, my entire approach. It took a month and a lot of computational power, so I had to use Amazon GPU instances on EC2. After training, I had a trained stack of CNN and RNN, connected sequentially. If you want the details, you can just come and talk to me. Once I have the trained deep learning model, I pass a newly captured image into it and it gives me a set of candidate captions, and there's a technique called beam search that we can use to pick out the best caption for a particular image. Once we have the caption, the third step is evaluating it. Since I have just one minute more: if you want to evaluate generated captions against human captions, you can't simply compare the words in the two sentences. You can use a metric called BLEU, the bilingual evaluation understudy, which is a modified form of precision and measures how good your generated sentences are with respect to the actual human captions. There are very good libraries for it, or you can write your own program. You just calculate that score; I got a score of 53.
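The BLEU evaluation step can be sketched like this with NLTK, one of the "very good libraries" available for it. The captions here are invented examples, not from any of the data sets mentioned:

```python
# Score one generated caption against reference human captions with BLEU.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

references = [
    "a group of people sitting in a room".split(),
    "people are seated inside a large room".split(),
]
candidate = "a group of people sitting in a hall".split()

# Smoothing avoids zero scores when some higher-order n-grams don't match.
smooth = SmoothingFunction().method1
score = sentence_bleu(references, candidate, smoothing_function=smooth)
print(round(score * 100, 1))  # BLEU on the 0-100 scale quoted in the talk
```

`sentence_bleu` returns a value in [0, 1]; multiplying by 100 gives scores on the scale the talk quotes (53 vs. Google's 58).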
The current state-of-the-art score is 58, from Google. Google was participating in the MS COCO captioning competition; Google and Microsoft tied for first position, and their BLEU score was 58. In one of my projects I got a score of 53, which is decent enough, given that I didn't have the resources of Google and Microsoft. I have this project's repository on GitHub. A lot of people are using it, and I have many issues open, which I try to fix. If you want, you can go to my GitHub account, which goes by the name Anurag Mishra CSE, and check out my projects. Thank you.

Thanks, Anurag. It was great knowing about your side project. Next up we have Yash Kotak from Myntra, who's going to talk about artificial creativity.

Hello, I am Yash Kotak, a product manager at Myntra. Today I'll be talking about artificial creativity. The reason I call this artificial creativity is that there's a general notion that machines are really good at math, but give them art, give them creativity, and not so much. But since we are a fashion company, we have been able to gather ample evidence to the contrary. Today I'll be talking about a very interesting project we recently took up. Myntra has a lot of private brands, more than 15, and we create thousands of fashion designs of our own every year. So we thought: can we take a crack at machines making these designs? How would it work out? We took it as a two-month project to figure out if this is even possible. We tried a lot of methods, and I'll talk about the results. One method that worked out really well for us is the GAN, the Generative Adversarial Network. Given the time, I won't go into much detail, but a GAN has a generator and a discriminator.
The generator tries to create new images, and the discriminator tries to figure out whether an image is real or fake, that is, whether the new image fits the data set or not. We fed Myntra's catalogue of t-shirts to this network for the generator to create new images; let's see how the results look. As the model trains, at the 10th epoch, the 20th epoch, you can see silhouettes of t-shirts coming out, but nothing that really looks like a t-shirt design yet. As you keep training, by the 100th iteration you can see there is a t-shirt there, but with a lot of noise and blur. By the 300th iteration, these are actual t-shirt designs, and they are not present in the original data set, so the machine has genuinely created new designs. Now, this is far from perfect: as you can see, the last one has one full sleeve and one half sleeve, and I can't really go and make that. The third one has a hand missing, but that's okay, because we just care about the t-shirt. So now I have the designs, but the problem isn't solved: I can't make all of these designs. I have to do a good job of curating them as well, because only some of them would sell, which is where the machine curation component comes in. The machine curation component is a deep learning model: we feed in all of Myntra's style data and sales data and train the model so that, given a new style, it can predict that style's performance. That's how we score the GAN-generated images, to arrive at a final set of designs, each with a more than 70% probability of becoming a top seller. We have about 30 designs selling on Myntra right now that were designed completely by machines. You can't really tell, browsing Myntra, whether a design is by a machine or a human.
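The generator/discriminator training loop described above can be sketched in PyTorch. This is a toy illustration on random vectors, not the Myntra setup: the sizes, architectures, and hyperparameters are invented for the example, and real image GANs would use convolutional networks.

```python
import torch
import torch.nn as nn

latent_dim, img_dim = 16, 64  # toy sizes, stand-ins for real image shapes

G = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                  nn.Linear(128, img_dim), nn.Tanh())      # generator
D = nn.Sequential(nn.Linear(img_dim, 128), nn.LeakyReLU(0.2),
                  nn.Linear(128, 1), nn.Sigmoid())         # discriminator

loss = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

real = torch.rand(32, img_dim)  # stand-in for a batch of catalogue images
for step in range(100):
    # 1) Train the discriminator to tell real from generated samples.
    fake = G(torch.randn(32, latent_dim)).detach()
    d_loss = (loss(D(real), torch.ones(32, 1)) +
              loss(D(fake), torch.zeros(32, 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # 2) Train the generator to fool the discriminator.
    fake = G(torch.randn(32, latent_dim))
    g_loss = loss(D(fake), torch.ones(32, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

The alternation is the key idea: the discriminator's loss rewards separating real from fake, while the generator's loss rewards fakes the discriminator labels as real.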
We have been tracking performance, and one very interesting statistic is that these are doing twice as well as human designs in terms of sales, selling out at about 2x the speed. So fashion design is another field that is soon going to be revolutionized by machines. That's all from me. This is a very ambitious project; we are publishing a lot of our work, and we are hiring very aggressively as well, so if any of you is interested in this work, do reach out to me. Thank you.

Thanks, Yash. That was great, knowing how machines are helping you create new products. Next up is Shyam Muralidharan from Ormai. He is going to talk about anomaly detection in web infrastructure.

Good evening. I'm here to talk about a project that's almost at completion. This is about anomaly detection in web infrastructure. Our business problem was that we had to identify why there was an anomaly, any variation in response time or an order drop, and how it correlated with the server metrics and with the response codes. This was the business problem handed over to us, for a major U.S. e-tailer. The velocity and veracity of the data were quite high: we dealt with data sets close to 12 GB, around 8 million records for one quarter. Multiplied out over a year, it was quite a challenge to understand and process all these data components, and that was a major consideration. We then had to break the data down into different applications and solve the problem per application. The major challenge was that the data was very unstructured and completely unclean. To be precise, out of the eight million records, close to 50% had NAs.
So the data had a lot of missing values, and we had to either impute them or find solutions that worked around them. We had to find out where the missing values were and how they affected our end result; this was something we had to compute. And there was no readily discernible pattern. There were rows where the response time was recorded close to 200 seconds, but the corresponding server metric values were pretty normal: CPU below 80%, garbage collection below 3%. So there was no conceivable pattern we could identify or isolate in the data, and this was one of the major challenges we faced. And as I said, there were five different applications involved on the customer side: the end-customer application, the customer's server-side application connected to the back end, and the front-end GUI as well. We had to correlate across all of them to find where the problem was: how did the response time spike, and why did it not meet the criteria set by the customer? The harder part was that the data was split into three different sets of records. One of them was the response codes, the HTTP response codes we received: was it a page not found, did the page connect, what was the order response, what was the order drop? All of this was a major factor in correlating with the end data set. So essentially we had the server metrics on one side, and the response times and response codes we had to correlate on the other.
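The "impute or work around" decision above can be sketched with pandas. The column names and values here are invented for illustration; they are not the client's data:

```python
import numpy as np
import pandas as pd

# A tiny stand-in for the server-metrics records, with NAs scattered through.
df = pd.DataFrame({
    "response_time": [0.2, 200.0, np.nan, 0.4],
    "cpu_pct":       [35.0, np.nan, 60.0, np.nan],
    "gc_pct":        [1.0, 2.5, np.nan, 0.8],
})

# Step 1: locate the missing values -- fraction of NAs per column.
na_share = df.isna().mean()
print(na_share)

# Step 2: one simple workaround -- impute numeric columns with their median.
imputed = df.fillna(df.median(numeric_only=True))
print(imputed)
```

Median imputation is only one option; whether to impute, drop, or model around the gaps depends on how the missing values affect the end result, which is exactly the question the team had to answer first.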
This was a major set of challenges, and we tried different sets of algorithms. First we did a very in-depth exploratory data analysis (EDA) to find the specific outliers: how do we treat these outliers, how do we isolate them, do we keep them? Are they an integral part of the analysis, or should we ignore those values? That was also a consideration. Then we applied a random forest; most of you are aware of it, and the model gave us very good insight. But before we approached the random forest, we reshaped the data: we essentially broke it out from 4 columns into 292 columns, creating one column set per server instance, with the recorded response time, the CPU, and the garbage collection values. This gave us good insight and helped us understand the data: how the response time was being violated, and what was essentially happening in the background. Once we ran the random forest, the initial iterations gave accuracies of only around 65%, but repeated iterations and tuning the hyperparameters gave us good control, and the final validation gave us a false positive rate of 3%, a false negative rate of 7%, and an AUC all the way up to 93%. This was also validated on completely unseen data: we built our model with one year of data, and once we finished building it, we gave it to the client side, where they tested it and got an accuracy of around 90%. The hyperparameters are still being tuned, but this was the objective we set out to establish. That's pretty much it about anomaly detection. Thank you.
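The modelling step above, a random forest with hyperparameter tuning validated by AUC, can be sketched with scikit-learn. The data here is synthetic and imbalanced to mimic an anomaly-detection setting; the grid values are illustrative, not the ones the team used:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic imbalanced data: ~10% "anomalies".
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Small hyperparameter grid, scored by AUC with cross-validation.
grid = GridSearchCV(RandomForestClassifier(random_state=0),
                    {"n_estimators": [50, 100], "max_depth": [None, 10]},
                    scoring="roc_auc", cv=3)
grid.fit(X_tr, y_tr)

# Validate on held-out data, as the team did with unseen client data.
auc = roc_auc_score(y_te, grid.predict_proba(X_te)[:, 1])
print(grid.best_params_, round(auc, 3))
```

Scoring the grid search by AUC rather than plain accuracy matters on imbalanced data like this, where a model that never flags an anomaly can still look 90% accurate.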
Thanks, Shyam. Do we have Anirudh Shah here? Yes. So Anirudh Shah is going to be talking about HPCC. Over to you.

Good evening, everyone. HPCC stands for High Performance Cluster Computing. It's an open-source system published by LexisNexis. It's been in production for 18 years, twice the age of Hadoop. It's extremely stable, obscenely fast, and I've been using it in production for the past four years and getting tremendous results from it. It has its own data processing language called ECL, a completely declarative language with lazy evaluation, designed from the ground up to work only with data. And because the underlying platform was built in C++, it is extremely memory efficient and you can do wonders with it. For example, today I'm using the platform at HDFC Bank for their marketing campaigns across the board. They run around 500 campaigns a month, generating 500 lists, and all of that runs within 24 hours on 10 desktops. You don't need fancy servers; this runs on 10 desktops. It is obscenely fast, and you can check it out at hpccsystems.com. I think it's a fantastic alternative to Hadoop, especially when you want to churn through large amounts of data. To give you perspective, HDFC Bank has 50 million customers, and we process around nine terabytes of data over and over again for the 500 different campaigns, across the entire base. Any questions about that, I can answer. And, okay, I forgot the juicy bit: it also gives you a distributed system to serve the results, so you don't have to go and get Redis or some other system. It's built right in: you write a query, run it, publish it, and there's a built-in REST API endpoint you can hit. And all of this runs on a distributed platform, so let's say your traffic goes from 10,000 to 50,000 requests.
Just add more nodes and you're done. Everything is fully integrated: fault tolerance, automated backup, live failover, all built in. So I really don't know why people go to Hadoop when HPCC is there. And it's fully open source. LexisNexis has been selling this for the past 17 years; they make $4 billion in revenue, all the large governments, insurance companies, and banks in the US use it, and it's available for free today. So I think you should go and check it out. It also has machine learning algorithms built in, along with a lot of other stuff. If you want to do data manipulation at scale, this is it. It can currently scale to 4,000 nodes, I think, at exabyte scale.

Thanks, Anirudh, for introducing us to this new technology, or maybe a rather old one that people are missing out on. Next, we have Naveen Kumar from Semantics3 talking about deep learning models on mobile.

Hello, everyone. I'm Naveen from Semantics3. The title says it: deep learning models on mobile. I won't be speaking about how to train models; this is just about inference. Say you have an image recognition model that tells you what is in a picture: how do we shift that to mobile? This is what I've been working on for the past year. The idea is that you can always take existing libraries like TensorFlow or Caffe2Go, which have mobile versions. But there are mainly two problems when you push something onto mobile. One is that you want it to be fast, and the other is that your overall app size should stay small. When you push something like a TensorFlow model as-is, it runs very slowly: a captioning system like the one the first speaker described takes almost two seconds on mobile, whereas it takes 200 or 300 milliseconds on a laptop.
If you want to make it faster, you can't just take the same TensorFlow code; you have to make it run the way it runs on the PC, which means running it on the GPU. And this leads to the other part, the implementation problem: the TensorFlow or Caffe code paths aren't implemented on the GPU in the mobile versions. So how do you proceed? There are two scenarios, two kinds of phones: controlled and uncontrolled. Controlled is Apple, where the hardware is constant; uncontrolled is Android, where the hardware differs from device to device. We solved the problem for the controlled case first. We used Apple's GPU framework, called Metal, which already had convolutions, and we implemented LSTMs on top of it to make things faster. The same models that previously took almost two seconds now run in roughly 200 to 300 milliseconds. The point I want to make is: don't just go with TensorFlow or the existing networks for the mobile version. Yes, train on the server side, but when you want to deploy on mobile, build your own implementation. That's for speed. The second thing is the size of the model. There are different techniques, but the basic point is that your graph, the trained model that gets saved, is saved as float values. One thing you can do is convert them to ints, and you reduce your entire model size four times, which is very good. Performance will degrade slightly, so you have to decide which layers' weights you want to store as int values.
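The float-to-int size-reduction trick above can be sketched in plain NumPy: store float32 weights as int8 plus a scale factor, cutting the stored size roughly 4x, at some cost in precision. This is a generic illustration of the idea, not any specific mobile framework's quantization scheme:

```python
import numpy as np

weights = np.random.randn(1000).astype(np.float32)  # stand-in layer weights

# Map the weight range onto int8 [-127, 127] via a single scale factor.
scale = np.abs(weights).max() / 127.0
quantized = np.round(weights / scale).astype(np.int8)

# At inference time, dequantize back to floats (with rounding error).
restored = quantized.astype(np.float32) * scale

print(weights.nbytes // quantized.nbytes)        # 4x smaller storage
print(float(np.max(np.abs(weights - restored)))) # small, bounded by the scale
```

The trade-off the talk mentions is visible here: the stored array is a quarter of the size, and the reconstruction error per weight is bounded by the quantization step, which is why it pays to choose carefully which layers' weights get this treatment.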
That is altogether a different topic, but these are the two things you have to take care of before even thinking about deploying models on mobile. And that speed approach was for the iOS side, where you can use the existing Metal framework because the hardware is fairly constant. On Android you can't build your own custom layers that way, because it gets very difficult as phones keep upgrading. But the latest Android phones, whichever have processors above the Snapdragon 820, you can use, because they come with three compute units: the CPU, the GPU, and a DSP chip. For phones above the 820, Qualcomm provides an API for applications to run on the DSP. You can experiment with it, and it works fairly well. And recently, about a month back, Apple upgraded their entire API to Core ML. So feel free to experiment with these, but don't go with the generic machine learning libraries as-is. That's it.

Thanks, Naveen. That's a fairly new approach, running models directly on mobile. Next up, we have Anand.

Hello, my name is Anand. I'm going to talk about a small tool called Firefly, which was built to deploy machine learning models. So that's me. We're building a data science platform at rorodata, and this tool came out of our frustration with deploying machine learning models. The problem is: how do you expose a Python function as an API for others to use? You may want to use the same function in a different environment, where setting everything up would be too hard, and it also gives you loose coupling, because you can redeploy again and again without really affecting the code that uses it.
A couple of use cases: you want to deploy a machine learning model, or preprocess an image, or have an API to do a live price check of something, et cetera. If you look at the traditional way people do this, it's actually quite involved. You have to write a web application. What about authentication? You have to figure out how to do that. What about data validation? And once you've figured all that out on the server, on the client side you have to go and figure out how to access it, so you may need to write a client library as well. It's a lot of work just to deploy a simple Python function as an API. Welcome to Firefly: deploying functions made really easy. All you need to do is write your code. Take an example: I have a simple square function in sq.py, and I run it with `firefly sq.square`. There you go, the API is ready to use. And you don't even have to write a client: just `import firefly`, create a `firefly.Client` with the URL, call the same function on the client, and pass arguments to get the results. Behind the scenes it sends a web request through an HTTP API and gets the response back, and as long as your function uses JSON-friendly data types, it just works without you worrying about anything else. A more practical example: you want to deploy a machine learning model. You write a model.py: import pickle, load the model first, then a predict function that takes a list of features and calls model.predict on them. Since you need to return a JSON-friendly type, you cast the result to an int and send it back. Run it with Firefly, and you get a remote model: create a client, call model.predict with the features, and you get the response back. You don't have to do anything else to deploy this.
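What Firefly automates can be sketched with just the standard library: exposing a Python function as a JSON-over-HTTP endpoint and calling it from a client. This is an illustration of the mechanism, not Firefly's code; Firefly gives you this (plus the authentication and ready-made client) without writing any of the server plumbing below.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

def square(n):
    return n * n

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read JSON keyword arguments, call the function, return JSON result.
        body = self.rfile.read(int(self.headers["Content-Length"]))
        result = square(**json.loads(body))
        payload = json.dumps(result).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):  # silence request logging for the demo
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0 picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the moral equivalent of firefly.Client(url).square(n=4).
req = Request(f"http://127.0.0.1:{server.server_port}/square",
              data=json.dumps({"n": 4}).encode(),
              headers={"Content-Type": "application/json"})
resp = json.loads(urlopen(req).read())
print(resp)  # -> 16
server.shutdown()
```

The JSON round-trip is also why the talk stresses JSON-friendly types: whatever the function returns has to survive `json.dumps` on the way out.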
If you're worried about authentication: when you start the server, pass a token, whatever token you want to give, and on the client side, supply the same token and it gets authenticated. If you don't give the token, you won't be able to access the API. There are a lot of other things we're working on, such as supporting other input formats; for example, you upload an image and get something back. We're also looking at validation based on type annotations. Python 3 has this cool feature called type annotations: for each function argument, you can say it's an integer, or a list of strings, et cetera. We're trying to see if we can validate the inputs based on that. And we're trying to add caching support, so that if the function is called with the same input again and again, it won't recompute but will give the same result back. The cool thing is that it's an open-source tool: you can go to github.com/rorodata/firefly and get the code. To install it, all you need to do is pip install firefly-python. These are a couple of resources: the firefly-python documentation, where you can find out more, and the GitHub repository. You can catch me around if you have any more questions. Any questions?
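The annotation-based validation idea mentioned above can be sketched like this. This is a hypothetical illustration of the concept, not Firefly's actual implementation; `validate_args` is an invented helper:

```python
def square(n: int) -> int:
    return n * n

def validate_args(func, kwargs):
    """Raise TypeError if a keyword argument doesn't match its annotation."""
    for name, value in kwargs.items():
        expected = func.__annotations__.get(name)
        if expected is not None and not isinstance(value, expected):
            raise TypeError(f"{name} must be {expected.__name__}")

validate_args(square, {"n": 4})        # passes silently
try:
    validate_args(square, {"n": "4"})  # wrong type for the annotation
except TypeError as err:
    print(err)  # -> n must be int
```

A server could run a check like this before calling the function, turning a bad request into a clean error instead of an exception deep inside the user's code.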