I believe you know I gave a keynote at Fifth Elephant three years ago, and that was on BigQuery. At that time at Google we had just launched BigQuery, so we were in the process of taking internal technology at Google and exposing it externally as services. Exciting times for machine learning. After 2012, I dived into the startup world and into e-commerce. What we do is manage marketing strategies and ads across channels like Facebook, Google, and display, and run ads on behalf of advertisers. So the crux of the problem is: how do you sell more products to customers?

What I am going to do today is take you through the business aspect of things: look at machine learning through the business lens. Because often, whenever you build stuff and it looks very good to you, someone is always going to ask, so what, so what, so what? How many times have you heard that "so what"? It is always about what it impacts. I will fly through a lot of examples of how we have used this in our work, to give you perspective.

At the end of the day it is a shop. Whether it is an eBay or an Amazon or a Flipkart, it is a shop. Earlier, shops used to be something like this, and you would have the shopkeeper very eagerly talking to the customers. They had i18n and l10n, completely localised language: you would see good shopkeepers talking three different languages just going by the appearance of the person. And the person is multitasking. If you look at how the shopkeeper deals with things, he is making a quick judgement based on the appearance of the person. This person may be talking to him, but she is not going to buy, so he may wind that down. He is going to come directly to this guy, who is not really engaging but is specifically looking for something, and he is going to try and make that sale.
Very specifically, if you dial into one particular user, this customer here (I keep referring to him as a user), what are the things you can tell me about him? Guesses? There are no right answers, go on. "He is trying to find something." "He is window shopping." He is window shopping. "He is using Adidas." That's a very nice observation. Right here, you can see Adidas. So without him telling you anything, you can probably work out almost everything about this person. Not everything, maybe not his girlfriend's name. But you can say he works at a company, he is choosy. Just look at his expression, just look at the intensity with which he is trying to pick a product. He is very specific, he is looking for the right product. The fact that he has an Adidas bag means he is not that price conscious: if he finds the right thing, he will pay through his nose for it. So a lot of these things you get to see from a customer, and a lot of the retail industry has worked that way over the past 50 years, 100 years.

Now suddenly the world is online. Now what do we have? It's just a string with some clicks on it. It's very hard to know any of this, right? How do you actually make sense out of it? Now, the way we run ads for companies, we are here, at the end of the day, to sell products for them. Suppose somebody is spending 1 crore a month on online marketing. Very likely the company wants 6 crores in revenue. So there is some kind of return on ad spend: a ROAS number, or a cost-income ratio, a CIR number. They all mean the same thing: I put in 1 rupee, how many did I get back? That drives all of online shopping at the end of the day. But the challenge is that instead of a nice photograph, the Adidas bag and everything, all we see is this, right? All we have is clickstream logs from the site, and you have to make sense out of them.
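To make the "1 rupee in, how many back" idea concrete, here is a minimal sketch using the 1 crore spend and 6 crore revenue figures from the example. The function names are made up for illustration, and CIR is assumed here to mean ad spend divided by revenue, which makes it the reciprocal of ROAS.

```python
CRORE = 10_000_000  # 1 crore = 10 million rupees


def roas(revenue: float, ad_spend: float) -> float:
    """Return on ad spend: rupees of revenue per rupee spent on ads."""
    if ad_spend <= 0:
        raise ValueError("ad_spend must be positive")
    return revenue / ad_spend


def cir(revenue: float, ad_spend: float) -> float:
    """Cost-income ratio (assumed definition): share of revenue spent on ads."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return ad_spend / revenue


spend = 1 * CRORE
revenue = 6 * CRORE
print(roas(revenue, spend))           # 6.0
print(round(cir(revenue, spend), 3))  # 0.167
```

Both numbers describe the same relationship from opposite directions: a ROAS of 6 is a CIR of roughly 17%.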
You have to figure out who that customer was, what we can do with him, what we market to him, and so on. And that's where data mining and machine learning play a huge role. At Google I managed a lot of consumer products, right? From Google Finance and Orkut to Google Apps to Google Maps. You know the funny part? It is all about data. Yes, there is a pretty UI on top, but it is the data that makes that UI intelligent. If you haven't, I strongly encourage you (I'm a big fan) to read the blogs of Bret Victor. Bret Victor was a UI designer at Apple. He designed the interface for the iPod, the iPhone, the iPad; a real guru in his own right. But if you read his blogs, you see he's always talking about the UI being predictive. Can you do things with data so that you can infer, and then show the user, what he would want to see but doesn't know yet? So all of Apple's UI, you think it's so pretty, but it's not about the pretty aspect. It's just that there's something in front of you that you already want.

So let me illustrate the problem that we're going through from an e-commerce perspective. Here's a bag. Amazon, Apple, eBay, Flipkart, take your pick. And there's a bag that's going to be sold, so there's a page for that bag. There are lots of parameters on this page, but primarily, as a shopkeeper, as an online retailer, what you see is: yes, I'm selling a bag, there's a bag page. Now let's take four characters. Take Amanda, who's coming to the site. What do you make of Amanda? Let's see. What do you say? Okay, Amanda is a bag visitor. She visited the bag page. And then you show her an ad, which is very generic, something that says: best deals on bags online, check out these bags. So if Amanda has come to your site once or twice, you then bombard her with these ads. Think of all the ads that you're seeing on Facebook or Google or even Times of India or Yahoo.
You don't want to know how much research is done behind showing you the best ad. For you as users, for all of us as users, everyone out there in the ad tech world is trying to make the best ad for you at that point in time. And so here, if you just capture the fact that Amanda's a bag visitor, all you can get is a boring ad like this. Now take Betty, Karla, and Dan: all of them bag visitors. Again, the same old boring ad. So at the end of the day, there is no personalization. Remember the whole talk about how a real shopkeeper reads people? It just disappeared, gone, right? And so many times you're irritated by those ads that just say the same thing. It's the same old message, the exact same looking ad displayed to everybody.

Let's learn from the gurus at Coke. This is probably 10 years ago, 20 years ago. When we think of online, we think we're at the cutting edge, but a lot of these were solved problems 20 years ago, 30 years ago. You've just got to look in the offline world. Coke is one of the most phenomenal brands out there. Look at these two ads and tell me how different they are. The first ad here says "Share a Coke with Kate". It's all about giving. It's about saying do stuff with your friends. It's about altruism. It's evoking certain emotions in you that say share with someone. It's about thinking of yourself among your friends. The other ad here says: think about it, man set foot on the moon because he wanted to set foot on the moon. It says it doesn't matter what the world thinks. It doesn't matter who's around you. You want a Coke, you get it. Right? That's the power of saying indulge in it. It's like when an ad says "you're worth it". It's about you. It doesn't matter what others say. It doesn't matter what your in-laws say. It doesn't matter what your husband says. It doesn't matter what your boyfriend says. You are worth it. You better buy that necklace.
You better indulge in that L'Oréal cosmetic, and so on. So that is the spirit of self-indulgence, and the other one is altruism. There are a lot of theories here, if you haven't seen them. We actually intersect the world of psychology as much as we do tech and as much as we do data analytics. Why? Because no one field alone is going to solve this puzzle. It has humans in the loop at the end of the day. So what this teaches us is that there are different ways in which you can approach the same person. Again, pointing to the bottom line: you need to know whom to sell to. You absolutely need to know how to sell; what we just discussed about Coke was how you're selling it. And you need to know when to sell. If I'm in a meeting and busy and you're showing me an ad, it's just not going to work. So you need to time it right.

With mobile phones now, you can time it down to the last level of detail. Think about it. Your mobile phone has so many sensors. If I have an accelerometer there, I know if I'm moving and walking, if I'm stationary, and so on, right? The whole Google traffic product was built around GPS dots. What can you do with GPS dots from all over the world? You have timestamps. So there were GPS dots that were static. Oh, those are probably static locations, so people are sitting there. Maybe those are offices. Then we saw GPS dots that were moving very fast. We said, hey, those are probably cars. Then there were GPS dots moving very slowly. Oh, maybe the cars are going slow. No, maybe not: people walking. So if you have a lot of people walking, maybe it's a mall. Maybe it's a residential area. So even though it was Maps, what I was actually doing was trying to make sense out of data. Think of malls, and think of entry and exit gates. It's a nightmare getting that data. How do you get data for malls and their entry and exit gates?
The only way is to look at the dots: are all GPS dots entering through one point, within a few meters? Are all GPS dots exiting at one point? Boom, you've got your exit and entry gates. So you have to be very smart about data. And with current technology, the amount of data that you have is quite a lot. So when to sell, you can time that, right?

"Just one question regarding the maps example. Does Google also use that to crowdsource traffic conditions?"

Yes, yes. So the question is, does Google use that data to crowdsource traffic conditions? Absolutely. That is how Google traffic evolved. And one of the nightmares we had on traffic (you can read about it on the Google blog as well) was that in India there are no speed limits, right? So when we first trained the traffic model for India, we had a speed limit of 60. We said, okay, 60 is normal, 20 is slow, 40 is in between. And Indian traffic just varied. If you're doing 20 kilometers per hour on the Inner Ring Road in Bangalore, it's awesome. It's super fast. So we had to mark that green, right? Those are the kind of challenges.

"It's interesting, I used to work very close to that. And it was so accurate that it would tell me: getting on the bridge, there's a block; getting down from the bridge, there's a block; but on the bridge itself, it was green, right up to the end of the bridge. It's very accurate. That's why I like it."

Thank you, thank you. Great. So it was very accurate. I used to test it in that area, and I stayed there as well. So these things become extremely important, and they're not impossible. Even though you have a lot of this kind of raw data, turning it into something that you can actually understand is not that impossible in today's age. So let's look at an example of how to do that. At any point in time, you could probably have about 150 signals around the user. And these are pretty much anonymized signals.
You don't have anything very personal. You don't have personal info, right? All you have is a clickstream. But based on what is being browsed and what the price of the thing being browsed is, just like the Adidas example, you are able to paint a picture. I'll give you a bunch of examples of how we do this.

So the first one: whom to sell to. Let's say Amanda came to the site, and what she actually did before getting to the bags page was go to the trending items page. And what she did there was sort. She sorted by new and popular. Often this is a very important signal on a website: does a person sort or not? And if you look at these kinds of signals, you say, okay, she's sorting by popular, she's going to the trending page. What can you make out of this? That's all we know about her, right? We don't even know her name; that's a fictitious name. What we can say is that this person likes trending stuff. She wants to be popular. She wants to have the latest dress in town. That's kind of her persona.

Giving you some numbers here: if a person uses a sort on an e-com site in the last session, the probability that they convert and actually buy something in the next session goes up from about 9% to 14%. These are averages and depend on the data set. Here it was a combination, about three to four million cookies and about 20 to 30 million events, run on a three-node Hadoop cluster in about two to three hours. So not a very exhaustive data set, but it gives you an idea. For every user, for literally every type of signal, you will see some correlation like this. So Amanda here, you can classify her as a trend follower, and we'll see next how to actually message her. Second, let's look at Karla.
Karla looks at these $10 items; she would always sort by price, or she'd sort by discount. She's always on the discount side. You must have noticed that all these e-commerce sites nowadays have sort by discount, sort by offer, sort by price, sort by popularity. Just amazing stuff, right? But that one action is totally giving away your persona right there. Now, that is used to show you the best product so that you can get what you're looking for quickly. Often the frustration of online shopping is "I can't find what I'm looking for". If you look at these people who are hunting for discounts, often they don't care what site they go to; they want the discount. And with cash on delivery they can be doubly sure: okay, no problem, I pay only when I get it. In fact, sites like Myntra are now offering trials, right? Would you like to try this before you buy? Of course. So they're making it really easy. It's amazing, the level of service that we are now getting in India.

Now look at frequent buyers. If you have someone tagged in your database as a frequent buyer, you can often see a 3x jump in conversion rates. This is very obvious: someone who has bought once keeps buying more. The strategy of e-commerce companies in the initial days is often to get first-time buyers, and then the question is how you turn these first-time buyers into repeat customers. That's why lifetime value becomes a very important metric here. Think of, say, kurtis, right? Kurtis are the first item that new e-commerce buyers, mostly women, buy. And the average price range for this first buy is about a 300 to 400 rupee kurti. That is the first buy. But once a woman buys a 300 to 400 rupee kurti and hopefully likes it, she's going to spend 6,000 rupees in the next six months. These correlations vary a bit from site to site, but they are there.
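The kurti numbers lend themselves to a quick back-of-the-envelope check. The sketch below is only illustrative: it works on revenue rather than margin, treats the 6,000 rupees as the customer's total expected spend, uses 350 rupees as a midpoint for the first order, and the 2,000 rupee acquisition cost is an assumed figure.

```python
def acquisition_view(first_order: float, expected_ltv: float,
                     acquisition_cost: float) -> dict:
    """Compare a customer's value on the first sale vs. over their lifetime.

    All amounts are in rupees and are revenue figures, not margins
    (a simplifying assumption for this sketch).
    """
    return {
        # What the first sale alone leaves after paying to acquire the customer.
        "first_sale_result": first_order - acquisition_cost,
        # What the expected lifetime spend leaves after the same cost.
        "lifetime_result": expected_ltv - acquisition_cost,
    }


view = acquisition_view(first_order=350, expected_ltv=6000, acquisition_cost=2000)
print(view)  # {'first_sale_result': -1650, 'lifetime_result': 4000}
```

Judged on the first sale alone, the customer looks like a loss. Judged on expected lifetime spend, the same acquisition cost looks affordable, which is why lifetime value is the metric to optimise for.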
And so how do you do the math? Do you say this person is worth 350 rupees because she bought that first kurti for 350 rupees? No. You say this person is worth 6,000 rupees. Now, to market to that person, I'm willing to spend 1,000 rupees, or even 2,000 rupees, to get this person. So essentially what you end up doing is spending 2,000 rupees to get that first purchase of 300 rupees. It's a massive loss for you on that first sale, but you're hoping that the lifetime value will pay for itself. That is the kind of math that you have to do. And this varies by site. I'll get into some of the challenges we face on the technology front in trying to deal with this, where the models are always different. So this person that we just talked about, she's a deal hunter. Classify that person as a deal hunter.

Then you have Betty. Betty comes and looks at all the brands. She's very picky. She knows the latest in brands; extremely brand conscious. There are two kinds of people that you see on e-com. One kind just goes for what they're looking for. They're looking for that dark blue braided accessory, and they just go for it and buy it. In and out, within two minutes they're done. They're not brand conscious; they're looking for something very specific. For the other kind, we keep doing this analysis, and we found that people who are searching for brands are also sorting by discount. Now, what does that mean? You're willing to buy a Calvin Klein and you're now looking for discounts on it. It doesn't quite correlate. But if you really think about it, it does. When we saw that statistically, we tried to dial into it as users: people want to show off that big brand, and get it at the cheapest price. Who doesn't want to? Why? Because Calvin Klein, oh, big brand. Somebody is wearing that.
Oh, there goes Kareena Kapoor or someone like that. Now suddenly you want to match Kareena Kapoor. You don't have the money to do it, so you look for a discount. It's basic human behavior. Data tells you a lot of these stories that you pretty much wouldn't otherwise imagine.

Site search is a very important signal here. Think of anybody who searches within the site. This is not searching on Google. This is coming to the site and then searching within the site for something specific. The conversion rates here definitely go up, from about 8 to 9% to about 14%. You'd see a direct bump. And we often ignore these little signals: the person is willing to search, the person is willing to look. So this person is a fashionista. She knows her brands, especially the high-end brands, right? It's not about one or two brands. She really knows the very niche brands. In every category there is a niche; there will always be a niche brand, very expensive. In New York, there would be a couple of brands like that, and so on.

The last one, Dan. Dan does certain things: on the shopping cart, he always checks gift wrap. Or he would end up saying, okay, give that one rupee donation. These kinds of guys are very altruistic in nature. They are often gifting items for friends, family, and so on. You'd find a category of these folks who are repeat buyers because they keep buying for others. This person often spends a lot of time on the site as well. So beyond signals like whether the gift wrap box is checked or not, which are great signals to look at, you can also look at time spent on the site. In sessions of less than 90 seconds, you would have an average conversion rate of about 6%. In sessions of up to 10 minutes on the site, you would see a bump up in conversion rate.
So they've gone up from one and a half minutes, and now you're looking at around, say, seven or eight minutes in the session. You'd bump up the conversion rate to about 9%. With more and more time it comes up a little further; an hour on the site, probably 10%. Then it kind of plateaus. So the people who really want to buy are going to buy in about that 8 to 10 minute period. You tag Dan as a generous soul. So instead of a bag visitor, you tag him as a generous soul. Now suddenly we are a little more informed about who came to our site, right? That's how you deal with data: you try to form those signals from the data.

Now the next part. Okay, you know about this person. You know what they're looking for, how they behave, what they want to buy. The other part is how to sell. How to sell depends a lot on the messaging that you use in your ads. Let's take the example of Facebook ads, the right-hand side ads. Amanda was a trend follower. You want to give her messages such as "Everyone's buying these". It's a trend, so you want to showcase trends. Everyone's buying these, Amanda, what are you doing? Bright colors like these are a fashion must. We use terms like "fashion must" and "everyone's buying these". At the end of the day, these end up as templates. So what we have are all of these ads coupled with these personas: ad templates for every type of person. There may be hundreds of personas. And actually there's some research in psychology, a framework which says there are not 200 or 300 types, there are only eight types: eight fundamental behaviors that any one of us can be classified into. It has been around for a long, long time, and brands use it very heavily. We've also tried it, and it works pretty well. So you take eight types of personas and you say, these are the eight templates that I'm going to create for ads.
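As a rough illustration of going from clickstream signals to a persona and then to an ad template, here is a minimal rule-based sketch. The four personas and the signals (trending page, sort order, brand search, gift wrap, donation) come from the talk. The field names, the order in which the rules fire, and the template wording are assumptions made for the example, and a production system would learn these from data rather than hard-code them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    """Anonymised signals from one visitor's clickstream (hypothetical fields)."""
    visited_trending: bool = False
    sort_order: Optional[str] = None  # e.g. "popular", "new", "price", "discount"
    searched_brand: bool = False
    gift_wrap: bool = False
    donated: bool = False


def tag_persona(s: Session) -> str:
    """Map signals to a persona. Rule order is an assumption, not a finding."""
    if s.gift_wrap or s.donated:
        return "generous_soul"
    if s.searched_brand:
        return "fashionista"
    if s.sort_order in ("price", "discount"):
        return "deal_hunter"
    if s.visited_trending or s.sort_order in ("new", "popular"):
        return "trend_follower"
    return "generic_visitor"


# One ad template per persona; the wording is illustrative only.
TEMPLATES = {
    "trend_follower": "Everyone's buying these {product}s.",
    "deal_hunter": "40% off {brand} {product}s.",
    "fashionista": "New {product}s from {brand}.",
    "generous_soul": "Shower her with love. Gift her this {product}.",
    "generic_visitor": "Best deals on {product}s online.",
}


def ad_copy(s: Session, product: str, brand: str) -> str:
    """Pick the template for the visitor's persona and fill it in."""
    return TEMPLATES[tag_persona(s)].format(product=product, brand=brand)


amanda = Session(visited_trending=True, sort_order="popular")
dan = Session(gift_wrap=True)
print(ad_copy(amanda, "bag", "AnyBrand"))  # Everyone's buying these bags.
print(ad_copy(dan, "bag", "AnyBrand"))     # Shower her with love. Gift her this bag.
```

Keeping the persona tagger separate from the template table means the same tags can be reused across products, and new templates can be tested per persona without touching the classification rules.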
Now at the end of the day, you want to do creative analytics: what type of creative works for what type of persona, to sell what kind of product. This is a machine that we have to keep churning day in, day out to ensure that the right product gets to the right person at the right price. For Karla, the deal hunter, you have something like 40% off on some big brand. The discount is always tied to a brand, so throw the brand in. Make them happy: you're getting a good brand at a low price. That works very well in terms of ads. For the fashionista, you want to throw in the names of big brands. In India you would have, say, bags from Hidesign, which are slightly more expensive; 3,000 to 4,000 rupee bags start there. This is typically someone who's very picky, someone who really knows fashion, someone who knows what's going on on the ramp. And lastly, for Dan, you want to showcase others. "Shower her with love. Gift her this." All of this kind of messaging can be used for people who fall into this generous soul type of bracket. And literally there are over 100 such personas that we've created over the past couple of years.

What you get at the end of the day is these templates, and it's striking how much the messaging matters for the same product. Take earrings. You want to sell earrings, and the first reaction is, okay, earrings I'll sell to women. But 70% of the people buying online are men. What do you do with that? You say, no, no, I'm just going to take the 30% who are women. And then what do you do for women? You say, this earring was made for you. Indulgence. You have to get this, this is the latest in fashion or design, and so on. But one tweak that we did, we said, okay, let's try men for a change. Men? That's not going to work. But what is the messaging you use? You target married men, say 32 to 38. And you say, "When is the last time you gifted your wife something?" Works brilliantly.
So it's messaging, at the end of the day, that turns the entire table. Even if you have the right target audience, that same married man, you could have said, "Oh, buy this necklace", and it doesn't work. But if you say, "When is the last time you gifted your wife something?" Bingo. So all of this analysis helps in actually getting the right message to the right person: four different messages for four different types of people. And of course, this expands into many more.

Then coming to some real-world examples of when to sell. When is very important. The best example I can think of here is the Super Bowl, last year or the year before last. There was a power cut. The Super Bowl is this massive event in the US, the equivalent of the cricket World Cup, and the ad slots are really, really expensive. The whole marketing that goes behind it runs into the billions of dollars. The lights went out and the game stopped. You know what Oreo cookies tweeted? They said: lights out, no problem, you can still dunk in the dark. And that one tweet, literally within an hour of getting sent, had 15,000 retweets and 20,000 Facebook likes. It just went viral. Going viral often hinges on timing. The timing has to be right, you have to capitalize on that timing, and there's no better example than this one. In fact, at the end of the Super Bowl, people were saying Oreo had won the whole thing, because they marketed it so well.

The way we use timing is still, I'd say, in its infancy. There's so much to be done on timing: ideally, precisely timing an ad for when you're getting out of the building, when you're about to embark on a new task, or during your downtime between two things that you're doing. But so far we work with things like booking flights. You see a pattern in a person booking flights. You say, oh, on the 25th of every month, this person books flights.
So does he book business class, first class, or regular economy? Does he book with one child, does he book three seats at a time, or does he always book one? A lot of this tells you whether it's a corporate traveler or a personal traveler. If he's a corporate traveler, you then want to time it right. You want to say "discounts coming, save your company money", and so on, on the 24th. You've got to find that pattern. You also see a little spike around the first to the fifth or seventh of the month, when people get their salaries. Youngsters, right? No responsibilities, lots of money in the bank, working in IT. They can splurge, buy some electronic gadget. So that's your window. When would you market to a young IT professional? Not on the 28th, not on the 29th, but on the 1st, on the 2nd, right? When they actually have that money. So timing at this level definitely works, and works brilliantly.

Another one: as a retailer, think of seasons. When you teach kids the seasons, it's summer, winter, rainy, three seasons. But in the shopping world there are no less than 20 to 30 shopping seasons. They've turned everything into a season. Back-to-school sale. Monsoon sale. Every little event is a sale. So you have to capitalize on that, and this seasonality needs to be captured. The reason I'm telling you all of this is to keep correlating it to the machine learning problem at hand. Your sales are going to vary by the season, by the messaging you use, by the timing you choose, by the product you are selling, and by whom you are selling to. So the parameters here cut across domains; these are very, very disparate parameters that correlate to make that sale happen or not, and it varies by client. So the way a discount site behaves is very different.
So the way a Groupon would behave, which is built around coupons, would be extremely different from the way a Flipkart or a Jabong would behave, because very different types of people are coming to those two kinds of sites. How do you correlate that? There is literally no one grand machine learning algorithm. So from our technology point of view, there are challenges in terms of how you keep training models for different clients specifically. You can't just aggregate data, because there is no uniformity in the signals across clients either. So that's not an option sometimes.

Then coming to one of the most important parts, which is what to sell. We said, okay, we're selling that bag. But if you really think about what to sell, the best example is this. I don't drink, but whenever I go with friends to Hard Rock Cafe, I'm the designated driver. Some of you may have seen this holster guy at Hard Rock Cafe. The people in our office are very big fans of this guy. So there's this guy with a holster. A holster is where you carry guns, right? Something like this. And he has a stack of shots in it, those shot glasses. His job is to figure out if anyone is going slow on their drinks, or not drinking enough, or has not ordered enough. His job is to upsell. He just walks over, makes a big noise, says "Shots, anyone?" and takes out two shots. He looks just like a cowboy. Brilliant experience. And it's exactly the product that you would quickly say yes to. And it's one of the more expensive items on the menu, I think, right? So what to sell at that point is extremely critical.

There are concrete examples of this. The way we use what-to-sell signals: say someone has bought a coupon for a gym. You know what everyone buys after buying a gym membership? They buy shoes. Many of these gyms have rules.
You have to have separate shoes that you come in with and separate shoes inside the gym. Or for some reason you just want to refresh your life: okay, I've joined the gym, I've paid that money, I have to go because I've paid the money. And now what else can I get? Okay, I'll get nice expensive shoes, and you go with it. So a purchase has to be followed up with what product to sell next. You can't say, I sold this product to the customer and I'm done, great, I'm happy. You have to find the next one.

Take a traveler, a person who buys flights. How irritating are those ads that show you "flight to Maldives, flight to Maldives, flight to Maldives"? I bought that ticket a month ago. What are you doing? Instead, show him what else to buy. What else is there in the Maldives? Show him discounts on some buffet lunch in the Maldives, a nice cruise there. So you have to keep upselling on different things. An interesting correlation we found was that dinner-for-two and all these kinds of coupons have an immediate correlation with something as silly as dental floss or a dental checkup. You're going on a date, and you really want to look your best. The last thing you want is bad teeth and bad breath. So that's dental floss. This is more true internationally, where the correlation is really huge.

And this is not limited to e-com. Iron Maiden, any fans? So, Iron Maiden. A company called Musicmetric in the UK actually ran the analytics. The band was not doing really well financially. It's arguable whether they commissioned it or somebody else did this data mining and gave it to them, but nevertheless, this data mining was done. What they did was look at BitTorrent data to find in which countries and which cities their songs were getting downloaded the most. They were facing rampant piracy. People loved them, but they were not making any money out of it. What did they end up doing?
They did this analysis. They found out in which cities their pirated songs were downloaded the most. They found places that they had not thought about, like São Paulo in Brazil, places that they wouldn't typically have imagined. You would go to New York, go to Chicago, and so on. But São Paulo? They said, sure, let's follow this trail. They actually followed this trail of cities that were sorted by something as simple as BitTorrent IP addresses by location. And boom, they were one of the most financially successful bands of the last decade. Little, little things like these. You just look at your top countries, your top locations. You don't even need to dive into so much detail as whether someone clicked sort by popular or not. Sometimes even a sort by location is all that it takes.

Target did this many years ago. It got mixed reviews, because it was like, dude, what are you doing, this is too private. But from a data value perspective, what they did was pretty smart. They had a registry for women expecting a baby. They said, okay, sign up for this registry, and then they gave them a free voucher or gift card. And then what happened was the women kept buying things with that voucher. So suddenly Target had a list of shopping carts for each person. You may have noticed Big Bazaar doing this smartly. I don't know if it still exists, but they had that free mobile phone recharge: you buy at Big Bazaar and then get a free recharge on your phone. What it is essentially doing is tagging all your shopping cart purchases to one phone number. So you're being identified. It's pretty smart from a data and personalization point of view. So Target said, let me take all the shopping carts of pregnant women and run some mining on them. Let's see what comes out. What they noticed was patterns like "unscented" being a keyword across all their purchases.
Unscented, because I remember when my wife and I were expecting, she couldn't stand the smell of anything too strong. It just happens in your first trimester and so on. So you tend to buy unscented lotion, you tend to buy unscented soap, which is typically uncommon. Normally you want scented soap. Here you buy unscented soap. So what they inferred was that anyone buying unscented stuff is expecting a baby: let us upsell them more baby stuff. For the most part it worked. It doubled or quadrupled their baby item sales. People found it magical. Hey, I was looking for this diaper, and it just came in the mail. So nice. For most customers, it worked great. And this shopping cart analysis, basket analysis, works like a charm. One last thought on the product side: you don't want to always bombard people with products. You don't want to say, hey, this product, this product, please buy this product, buy this product. You also want to engage with the person. Today a sale is a complete journey across channels. You have search, where you have intent: you want to quickly search for something. You have Facebook, where you are going to drive impulse purchases. You can never sell a TV on Facebook. You can sell TVs on search. You can sell [unclear] on Facebook. You can sell accessories on Facebook. So, impulse purchases versus intent purchases. Very often it's a combination of the two. You can drive an impulse with Facebook. The person forgets about it. Then later the person comes to search and tries to search for that same item. So what you want to do is build a relationship with the person, so that the person keeps coming back to your site for more. For somebody who has bought something around food, give them an article: how do you use a chef's knife for the first time? If she's bought a knife, show her an article about it, so she gets more involved with your brand. This is another technique that works.
That is about building lifetime value with a customer. So overall you saw all four of these things. You can derive the data. You can use the data. It's pretty much in its infancy compared to what it could potentially be. It really could be the extremely personalized world that you would ideally want to live in. So this is just a start. Now I'll spend a few minutes on the technology part of it. The focus of the talk has been business, but it wouldn't be right if I didn't give you some sense of the technology front. We deal with hundreds of billions of pixel fires every month. In terms of data, we're into hundreds of terabytes, not yet petabytes. In the initial days MySQL was okay. You start with MySQL, or you cluster MySQL. You keep all of that because everyone knows MySQL. Most companies start like that. Data, MySQL, easy. A little upgrade from there is Mongo, when your data structures get messy. So we use a lot of Mongo and MySQL. And then we started using Redis quite a bit for caching. That stack worked well for a couple of years, until we started seeing more and more data. Over the past year we've literally gone from about 40 advertisers to 4,000 advertisers, now taking not just the bigger ones but also smaller ones: dentists, restaurants. How do you manage 4,000 or 5,000 advertisers, even semi-manually? Not possible. It has to be automated. So as we had to handle more and more of these things, we automatically switched to storing a lot of this on, say, Hadoop: HBase, running MapReduce jobs, streaming site data onto Hadoop. And even that's okay. It works well. But we also started hitting limitations on other fronts. For example, ActiveMQ. Things that you don't think would break start breaking. ActiveMQ is not very good at scaling. We are now in the process of switching to things like Kafka. Kafka is something that LinkedIn open-sourced: a beautiful message broker system, very distributed, works at scale.
ActiveMQ does well in terms of latency. Low latency, and that's what you want for your message service. But it really cries out on throughput after some point. The other thing that we realized was about MapReduce. In fact, as you may have seen, Google has also mentioned this publicly: Google doesn't use MapReduce for most things now. It's moved to Flume and MillWheel types of systems. Flume and MillWheel are equivalents of things like Storm topologies, Spark and so on. In case you haven't explored them, we are in the process of making that move: exploring things, evaluating these, benchmarking these systems. The basic difference here is that Hadoop is great for batch processes. But think of doing things while data is streaming. You have a clickstream coming in, and if you don't act in the next five minutes, the user's gone anyway. How do you deal with that? You want something that will process things as they stream. So Storm, which Twitter open-sourced, was created for trending-topics analysis. It basically passes things through a bunch of what they call bolts. You pass from one to another, and another to another, and the processing happens. So it is not the traditional MapReduce way of thinking, but somewhat different in how you are able to deal with streaming data. Then come things like Spark, where you say, oh, Storm can deal only with streaming data, while Spark can deal with both streaming and batch data. So you can also correlate your streaming data with something historical, or with a batch process running on it. So these are the kinds of differences. You start hitting them the more data you get. The other challenge that we face is this: yes, there is a fancy machine learning algorithm in something simple, say R or Java and so on. But here we want it to run inherently as part of a MapReduce.
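To make the bolt-to-bolt idea concrete, here is a minimal sketch using plain Python generators. It does not use Storm's actual API, and it runs in one process rather than being distributed across machines. The bolt names, the `user_id,product_id` line format and the threshold are all assumptions invented for the example. The point is only that each event flows through the chain as it arrives, instead of waiting for a batch job.

```python
from collections import Counter

def parse_bolt(lines):
    """First stage: turn raw 'user_id,product_id' lines into tuples, dropping bad rows."""
    for line in lines:
        parts = line.strip().split(",")
        if len(parts) == 2 and all(parts):
            yield parts[0], parts[1]

def count_bolt(events):
    """Second stage: keep a running click count per product and emit every update."""
    counts = Counter()
    for _user, product in events:
        counts[product] += 1
        yield product, counts[product]

def alert_bolt(updates, threshold):
    """Third stage: emit a product the first time its count reaches the threshold."""
    for product, count in updates:
        if count == threshold:
            yield product

# Toy clickstream (invented). In a real system this would be an unbounded stream.
clickstream = ["u1,shoes", "u2,shoes", "bad row", "u3,bag", "u1,shoes", "u4,bag"]
trending = list(alert_bolt(count_bolt(parse_bolt(iter(clickstream))), threshold=3))
```

Here `trending` ends up containing only `"shoes"`, which is flagged as soon as its third click arrives, without waiting for the rest of the stream.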
Now, if you think of initiatives like PMML models and all, there are libraries that can be thrown on Hadoop, and you can run them as part of MapReduce. But I remember it was so simple at Google. Why? Because every possible data mining algorithm out there was rewritten in MapReduce form and was available for use. So as an engineer, you never thought about it. You just said, I want to use LDA, and it would be the most efficient MapReduce implementation possible. Outside, that doesn't exist. But these are things that are needed. We need to figure out how to make these, maybe not fully online learning, ideally yes, but at least batched online learning. So every two hours, every four hours, or at least every day, can we retrain a model? Can we make it so personalized that it continuously learns from the data coming in on a minute-by-minute basis? So there are a lot of gaps in those technologies that we are now trying to hunt out. I'll take a couple of questions if we're not out of time. But a strong, shameless call here: please join our team. We're doing cool stuff. We really want help. If you like machine learning, we want to jump to the next stage. We want to skip this whole Hadoop part. Instead of reinventing Hadoop or trying to fix things there, we want to jump to the next level. You've seen Google's Dataflow. There's Amazon's Kinesis. There is Data Pipeline on Amazon. Pretty powerful systems that we are in the process of moving to now. So, questions? Yeah. Hi, thanks for the great talk. First of all, this is the second time I'm seeing you in person, after a long time, and there's a lot to talk about. But the second part is, you said something in the first part. You were saying somebody comes in and you're just following some patterns. How do you handle anomalies?
What if somebody comes for the 300-rupee item you mentioned, and you spend a thousand rupees because somehow your analysis says she or he will bring in 6,000 rupees? Yes, yes, yes. [unclear] I'm working on something where this is actually very high. How do you deal with this? No, great question. So the question is, I'm throwing out some assumptions and some learnings, but clearly there will be 10 people who don't fit this bill. So you're going to spend 2,000 rupees in the hope that this woman will come back and spend 6,000, but she doesn't. She just keeps hopping between different sites and buys 300-rupee things. So how do you eliminate that person? There's no great way of doing that other than figuring out outliers around certain data. We are not there yet. But there will be certain patterns. One pattern I can tell you off the top of my mind is the source that they come in from. Say they come in on an affiliate click. Affiliates are those clicks on blogs and websites that trick you into going there. You're watching something else and, boom, suddenly something else comes up and you're on another site. Or it's free Wi-Fi: click three links and then you get free Wi-Fi. A lot of those sites are called affiliates. They get 2% or 1% of the sale value if you buy. And those are the one-night kind of things, where they just come in, go, and you're done and you never see them again. Those are the alarming ones. So you have to figure out what signals correspond to those anomalies. You have anomaly detection even in terms of clicks: negating which clicks are bad. In the display world, there are a lot of bad clicks, fraudulent clicks, bot clicks. It's a whole world in itself for anomaly detection. [unclear]
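The traffic-source signal mentioned in this answer can be sketched as a simple repeat-rate check. This is a toy illustration under stated assumptions, not the production approach: the function name, the `(customer_id, source)` record shape, the rule that a customer belongs to the source of their first order, and the 0.2 cutoff are all invented for the example.

```python
from collections import defaultdict

def repeat_rate_by_source(orders):
    """Share of customers per acquisition source who ordered more than once.

    orders: iterable of (customer_id, source) in time order. A customer is
    attributed to the source of their first order (an assumption of this sketch).
    """
    first_source = {}
    order_counts = defaultdict(int)
    for customer, source in orders:
        first_source.setdefault(customer, source)
        order_counts[customer] += 1

    customers = defaultdict(int)
    repeaters = defaultdict(int)
    for customer, source in first_source.items():
        customers[source] += 1
        if order_counts[customer] > 1:
            repeaters[source] += 1
    return {s: repeaters[s] / customers[s] for s in customers}

# Toy order log (invented).
orders = [
    ("a", "search"), ("b", "affiliate"), ("a", "search"),
    ("c", "affiliate"), ("d", "search"), ("e", "affiliate"),
    ("d", "search"),
]
rates = repeat_rate_by_source(orders)
# Flag sources whose customers almost never come back (0.2 is an arbitrary cutoff).
flagged = [source for source, rate in rates.items() if rate < 0.2]
```

With this toy log, every search customer returns and no affiliate customer does, so only `"affiliate"` is flagged. In practice the source would be one signal among many, and small sources would need a minimum sample size before being judged.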
You said that for a large share, a particular user is doing this. So if the UI can help you out, why not use it? For example, you ask certain questions. Yeah. Say there are ten or so dropdowns. Yes. So that your data gets sorted out perfectly. Yeah. You can get it. Yeah. So you said about 10 to 15, where she's just hopping around and not buying anything. Yeah. Why not put in some UI to classify these kinds of users? That can be as good as [unclear]. Absolutely. Absolutely. So one of the emerging branches of machine learning you will see is people who have an innate understanding of data and machine learning and at the same time understand how the UI impacts it. Those people are gold today. Typically, even machine learning gods are very hard to find. Amazing crowd here, by the way. But you would at least find 100 people who know machine learning. You would find maybe two people who know machine learning combined with, say, how it impacts UI, or machine learning combined with how it impacts business. So these are exactly the areas to get into. So the question is, with this technology stack, you have to set up your servers and maintain those servers. Have you looked at using SaaS services? Yes, which is great. So you would evaluate Impala, and you say, the Impala stack gives really fast queries, and so on. But we have also evaluated Redshift and are pretty impressed with it. With Redshift, for example, we were joining two tables, one with 20 million rows and one with 6 million rows, in a very complex join with where clauses.
So, literally, scan through both tables and give me the answer. It took literally two seconds, and we were shocked. It's nowhere close to what MySQL does. It's a huge improvement. Even if we had written a MapReduce for it, it would have taken longer. So there are very good quality things out there. There is BigQuery from Google. There is Redshift on Amazon. There is, yeah, SAP HANA, very expensive, but SAP HANA is a very interesting one. They do everything in memory. Actually, with HANA, I was at the conference where they launched it. The story goes that the CEO or CTO was in Berlin and said, you know, it's suddenly getting hot here. If I could detect this change in temperature and quickly fire a query through my ERP systems to ship more ice cream trucks towards Berlin, that would be perfect. That was the business problem they were trying to solve. Yeah, that's a great one. I think the expense keeps going higher and higher. BigQuery and Redshift are very reasonably priced for what they do. Yes, yes, yes. No, very true. So the question is, this math seems flawed, right? Everyone is trying to run on discounts. They are overspending on marketing, just hoping for customer acquisition. When will those numbers plateau and get accurate, so these guys don't make losses? It's a tough business. There are 20 retailers with really good products vying for the same customer. And right now, unfortunately, it is in this discount phase. Hopefully it will get out of it. Hopefully it will evolve to people picking stores because of service and longevity rather than discounts. How do we justify the cost of the service? [unclear] Oh, you're going to apply the same analogy to IT costs.
So the question is, the benefit of this analysis may be less than the cost of actually doing the analysis and setting up the service. Actually, it's driven by scale. If you're going to do this for 4,000 clients, you have to do this setup only once. You also move to things that are on demand. Say Amazon EMR, Elastic MapReduce: you only run it when you need it. So you move to these on-demand pricing models where your costs are minimized. So yes, cost is a huge aspect of it. But it's coming down. It is much better than the e-com story. It's fairly logical, because all engineers are dealing with it. Yeah. Yes. So the question is about the user system, recommendations and so on. We do some of this. We mainly focus on the ads aspect, but the technology that's built for it is very much applicable to recommending products on the website and so on. So we're in this phase of creating IP that can then get out there as products, along very similar lines. I think we're out of time. Okay, last question. Yes, yes, yes. We do. So, depending on the customer, let's say HTML ads are displayed. The customer is viewing an ad. Maybe we don't have any past history of the customer. But what we do know is that it is being shown on an e-com site. Maybe we know that it is being shown in a Chrome browser. Maybe we know the screen resolution it is being shown at. Maybe we know the background of the page is blue or white. Little pieces of information like this can be used to personalize it. Not on social media. Social media is primarily Facebook, so that is on FBX. When you do Facebook Exchange in real time, you could do that. The other one is primarily where HTML ads are displayed. When you have HTML ads, because HTML is so powerful, you dynamically fetch things from the server.
So you render a container, and then you say, this one line in it I want to fetch from the server, personalized for this guy. That works as well. Thank you so much.