Our next speaker today is Rahul Kulkarni. Rahul is the Chief Product Officer at Sokrati, and I think it will be quite an interesting talk about how targeting is done and how the technology works. Thanks.
All right, let me go on. My name is Rahul Kulkarni and I am the CPO at Sokrati. Sokrati is a digital marketing and analytics company. We are about a four-year-old startup, and our DNA is very much analytics and engineering. It was started by three Amazonians who returned from Seattle, set up base in Pune and started Sokrati. I joined from Google about ten months ago. So whatever we do is very much analytics. Most of the large e-commerce players and finance players are our customers. What we basically do is run ads on Google, on Facebook, on display networks, on YouTube and so on.
We have a lot of experience in ad targeting. Where this comes from is, you know, I'm really bad at telling twins apart and really bad at differentiating faces, especially at weddings when somebody comes up and recognizes me and I'm completely blank. Ad targeting is a much easier way to differentiate between people. Forget the names and forget the faces. We can deal with cookies, we can deal with behaviors, and those are very, very distinct.
So what I want to do today is walk you through three things, and we have a short amount of time, so I'm going to stress three big pieces. One is, how does the whole cookie story work? How many of you are familiar with how ad targeting works, how cookies and ads work? I'm sure most of you would be. I'll keep stretching the level of difficulty here and hope to give you more insight into how cookies work in terms of the way ads are targeted. The second part I'd like to give you a dose of is, what are the variables involved in ad targeting?
Now, yes, there's one cookie, but based on that cookie you're going to derive a lot of other variables, from the color of the ad to the interests of the user and so on. So let me give you a run-through of what those variables are. And when you toss all of these variables into the mix, that's a big data problem. Lots and lots of data. We literally have two billion records getting into our database every month. This is not about just getting the data and being happy. This is about using that data, processing that data, to actually target these ads to the right users. That's the most important step. So the third part touches on that: how does the targeting happen, how do we manage this big data, what techniques do we use? I'll give you a small sneak peek of that piece.
To start with, a simple question. Here I go to Engadget and I'm reading through this blog, and then after a couple of minutes I switch to TechCrunch. And there I see the same Engadget ad. Can anyone tell me why this is happening? This is retargeting, very basic retargeting: I go to Engadget's website, Engadget drops a cookie on me, and then on TechCrunch some ad network has a container which looks for that cookie, says, hey, this guy was on Engadget, it's the same guy, and shows me an Engadget ad. That's how basic retargeting works.
Now, let's take that a step further, and let me give you some detail on how much you can target with retargeting. Is anyone familiar with this screen? This is from Google Analytics remarketing. It is part of Google AdWords, a very powerful tool for remarketing. If you drop your analytics and remarketing cookies, you get this kind of screen to create a visitor filter, so you can create these buckets. Right now I said, hey, this is the guy who came to Engadget, show him this ad.
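The basic retargeting flow just described can be sketched in a few lines. This is only a toy model, not how any real ad network is implemented: the `AdNetwork` class, its method names and the in-memory dictionary are all assumptions made for illustration, and the cookie value is a random stand-in.

```python
import uuid


class AdNetwork:
    """Toy third-party ad network that remembers which advertiser sites a cookie has visited."""

    def __init__(self):
        # cookie_id -> list of advertiser names, in visit order
        self.visits = {}

    def track_visit(self, advertiser, cookie_id=None):
        """Called by the network's tag on the advertiser's site; issues a cookie if none exists."""
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex  # stand-in for a real cookie value
        self.visits.setdefault(cookie_id, []).append(advertiser)
        return cookie_id

    def pick_ad(self, cookie_id, default_ad="generic ad"):
        """Called by the network's container on a publisher page."""
        seen = self.visits.get(cookie_id)
        if not seen:
            return default_ad  # unknown browser: nothing to retarget
        return f"{seen[-1]} ad"  # retarget the most recently visited advertiser


network = AdNetwork()
cookie = network.track_visit("Engadget")      # user reads Engadget
ad_on_publisher = network.pick_ad(cookie)     # same browser later loads TechCrunch
ad_for_stranger = network.pick_ad("no-such-cookie")
```

The point of the sketch is that the publisher page never needs to know who the user is. The network's container only checks whether the cookie it reads has been seen on an advertiser's site before.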
You can actually give a lot more variables there. By dimension, you can say the revenue of the product he was looking at is greater than $100. What else can you say? You can do sequences. You can say this guy first added a product to the cart, then removed it from the cart, and then added it again. I want those kinds of guys. You can actually target those as well. So there are a lot of very powerful ways you can slice and dice these cookies through a simple UI.
Now, another use case. I go to Forbes.com and I read some real estate articles. Oh, REIT is getting tough. Is it a good investment for retirees, and so on. I am browsing this on Forbes. Then I suddenly start getting these ads. I see a housing.co.in ad, and it knows I am in Pune and it is showing me listings. I have never even heard of this site. I have never gone to housing.co.in. I see Sotheby's International Realty. Whoa, I am never going to buy there. I have never heard of them and I have never gone to them. Remember the first scenario? I had gone to Engadget and it was simple retargeting. This one I don't quite get. It's completely out of the blue. Yes, I was looking up some real estate articles, so it kind of makes sense, but I just don't know the math. How exactly did this happen? Let me explain that piece to you.
The moment I went to Forbes, there were 30 cookies dropped on me. A bomb of cookies, right? Just cookie, cookie, cookie, cookie. How many of you use Ghostery? Ghostery is an amazing plugin for Chrome. Just search for the Ghostery plugin, G-H-O-S-T-E-R-Y. It gives you the number of cookies being dropped on your machine at that point, and you can selectively block them. The cookies that get dropped here are from a wide variety of players. Some are analytics companies: Forbes wants to track how many people are sharing this, how many people are coming to this and so on.
And then there are the BlueKais and Experians and Datalogixes of the world. These are data exchanges, purely dropping cookies to profile customers. How does this piece work? If you do a Ctrl+U on any page, say on this Forbes page, it looks like a very normal page. But there will be one little .js file, some link, which in turn looks as crazy as this. This is actually the source of the PubMatic cookie file that they throw in. And this one .js file in turn fires away a bunch of requests. So what you think is one request turns into 30 requests, and that's how those 30 cookies get dropped. It could be done by several of these cookies. That's cookie piggybacking: you throw one cookie, and it can throw more cookies. It's a matter of convenience. It's also a way of tailoring things for the user. At the end of the day most of these companies have their privacy rules set well: no personally identifiable information, and so on. They stick to those things, and it makes the experience better for the user on the internet.
These cookies that are dropped are ultimately passed on to advertisers. I'll show you how that's done, taking the example of BlueKai. Forbes is a BlueKai partner. What's BlueKai? BlueKai is a data exchange. What data exchange means is that they have a bunch of partners, and Forbes is one of them. All those partners there on the left, some are blog sites, some are data exchanges themselves, and so on. The little icons next to them signify what types of user cookies they are making available to BlueKai to sell on the open market. These are types: all of the users that go to all of these sites are being classified as people who like finance, people who like real estate. You know, there's even political affiliation.
You can say he's a Democrat, he's a Republican. You could say things like, okay, what has this person bought in the last 30 days? Remember, no personal information, but something that tells you about behavior, something you could make an educated guess on when thinking about how to target this person. All of these cookies get categorized into different buckets: this is my retail bucket, this is my finance bucket, this is my political-affiliation-with-this-party bucket. And all of these go into BlueKai, which is a data exchange. From there, BlueKai shares that with hundreds of ad partners. These are places where ads are shown. You would have Google's DoubleClick, you would have folks like AdRoll, AdMeld and so on. They actually display the ads, and that's how the connection is made. So finally it goes from the publisher, which is Forbes, on what the industry terminology calls the supply side (SSP, supply-side platform), to these data exchanges, ad exchanges, ad networks, and demand-side platforms at the end of the day. That's how a cookie travels through its life, getting you this information and passing it cross-site. So when you say, whoa, how did you know I'm interested in real estate, that's what's happening behind the scenes. And this is just one flavor of it. There are a bunch of ways cookies go from here to there.
That brings me to the second part of the talk, which is the variables. What are these variables? There are a lot of things that go into how you target a user. When you're doing advertising, you have to know who your target audience is. Just like when you're doing sales, you need to know whom you're going to sell to. When you're doing marketing, you need to know whom you want to reach out to.
Given all these cookies around, even Facebook offers you a lot of options to target people. You can target by age, by interest, by employer, by fans of a page, by connections, and so on. On the Google Display Network or other display networks, you can target by the context of the page or by the interest of the user. Gmail sponsored ads allow you to target based on what's in the inbox, and that's about the user, not about the content. So all of these have a lot of parameters in terms of targeting. What we do at Sokrati is simplify this into user personas. We have something called Sokrati persona identification, where we use a combination of these. Some are intuitive correlations that we have made over the past. Some are hard data from cookies, where we say, oh, this interest corresponds to this intent, and hence we draw some conclusions about what that person would buy. Again, remember, at Sokrati we are always on the advertiser's side. We never share cookies. It's only for that advertiser. When we are authorized to collect something from an advertiser, we only use it to advertise that advertiser's products. Every company has its own internal policy on how it wants to use cookies.
Let me give you a quick example, to give you a sense of what kind of data flows in and what kind of parameters need to be captured. Here's a user. He comes to Google, searches for a car, sees an ad, clicks on the ad, and ends up on the page. Assume this is for a single advertiser. We deal with only one advertiser at a time and are completely blacked out from the others. Let's say the advertiser sells cars and cats and monkeys and everything. The user searches for a car, clicks on an ad, comes to the site, and expresses interest there. So we know, hey, this user is interested in this keyword, interested in cars, and came to the site. Right? We store that info.
We just know this is one user. Now we throw more ads on Facebook, and we don't know if it's the same guy or not, but somebody clicks on an ad for a cat and comes to the same website, landing on the cat page because it's a cat ad. Because it's the same computer the cookie was dropped on, it's the same cookie. So now we know it's the same guy, and that the guy likes cats and cars. Another person also does something on the website. That person could browse through cats and cycles and some education stuff, and we know, okay, this is the guy who likes cats, cycles, education and so on. Now let's say he goes ahead and purchases a cat. If he purchases a cat, we tag him as a purchaser of a cat. But what do we show him? The mistake most companies make is, great, he purchased a cat, let's show him more cats. It's so wrong. He's not going to buy another cat. He's going to buy the other things he browsed through. So the thing to show him is the car, the education thing, the little cycle. That's how you're going to get him buying more. Or you show him cat accessories. That also works well. We see this day in and day out, and these common strategies finally have their core in the persona that you're targeting.
Let me go through the other variables quickly. The ad content is very important, so we always try multiple variations of an ad. You would think an ad is an ad, but not quite. When Snapdeal has 200,000-plus products, or Indiatimes, or Flipkart, you know, big companies, big sites, it's a large inventory. All of them have so many products. You can't really tailor-make ads for each one. At the end of the day, you have to figure out, okay, these are the product lines that are selling, and I'm going to have multiple variations for these. So what we do is try out variations across products. Variations involve the background color, which is important.
Variations involve the text, of course. The text is important, especially on Facebook. If you're targeting a married man and you have a headline which says, "When was the last time you surprised her?", it evokes a response. The picture needs to match as well. If you have an ad creative with a picture of a couple and a headline saying, "When was the last time you surprised her?", targeted at married men, maybe 30 to 35, it just works. The CTR is far higher than with any other targeting you do. So it's a combination of the background colors, the little images, the text you show, and so on.
The other piece is seasonality. There are certain products that get sold within a span of one hour, from the first click on the ad to the final purchase. Recharges are an example. You're searching for an online recharge for Idea, you click on the ad, you come to Paytm, and you're like, oh great, Paytm, awesome, let me click and make the purchase. Five minutes from ad click to lead. Versus when you're buying a TV, you're going to do a lot of research. Smart TV, smart, smart TV, super smart TV. And you're not going to buy it for at least three months. So the time you have to wait until the actual lead materializes is a lot longer. The seasonality factor is huge, because there are certain things you buy in certain seasons. And don't think of seasonality just in terms of sun, moon and wind. It's also festivals. It's also downturns in the market. It's also, hey, you would be looking for that weekend getaway with your girlfriend towards the end of the month when your salary is in. There are certain things you would do when the salary shows up in your bank account. There are certain things you would do in the week just before the salary comes in. You do things just before you get your bonus.
And all of that is seasonality at the end of the day. So that's another element you need to keep in mind.
Then there's the whole color palette, not just for the ad, but for the placement as well. What we often do is look at the primary colors in the placement page. You have thousands of placements, right? We have algorithms to figure this out: let's get all of these placements and find, oh, there are black-background pages there. White ads are going to do well there, not dark backgrounds. So how do you use that as another variable in this whole mix? That's a brief taste of some of the variables, primarily for targeting in terms of interest, plus a lot of things in terms of the ad creative. The mix and match of these is what creates a good ad.
Now coming to the third part, which is the smart targeting piece. How do you actually make sense of all of this? Great, you captured all of this, you have all this data. From a big data perspective you're like, awesome, dude, this is super, I have 2 billion records in my database, and you're all happy and touting it. That's really good, but how do you use it? I'll give you one technique that we use. There are hundreds or thousands of audience segments, the user personas we talked about. There are hundreds of products that we could potentially sell to the person. Then there are hundreds of combinations of ad creatives and placements. If you try a hundred times a hundred times a hundred, you're going to run into lots and lots of combinations, because you need to map the persona to the product to the creative to the placement and so on. You're looking at combinations: A times B times C. There are two ways of doing it.
Consider just three variables, each having 10 variants. You have 10 times 10 times 10, so 1,000 combinations possible, right? One option is to say, forget it, let me just assume these 10 variants won't affect anything. This targets the 15 to 45 age group, all females. That's one way of going about it, where you say, I know jewelry is only going to be bought by females. And maybe you're making a big mistake, because maybe there's that 30-year-old male close to his anniversary who's going to buy something. If you don't try, you will never know. So it's sometimes a mistake to make these broad assumptions just to save the effort of creating those 1,000 ads. The other option is to go highly fine-grained and actually launch all 1,000 ads. Even that's prohibitive, because launching 1,000 ads manually on Facebook is not happening.
So what we do at Sokrati is use good old design of experiments. How many of you remember design of experiments from school? It's controlled experimentation, a very proven statistical technique used all over the place. We use it very heavily for ads and ad targeting. I won't get into too much detail, but what design of experiments says is, hey, if this is a cube, a three-dimensional space with three dimensions, you give me the values for this corner and this corner, and I will tell you what the values for these two are. You don't need to run those combinations to get me those two values. And if you give me this and this, I'll tell you these two again. So just from these four values, I'll get you all 8 or 10 values. When you scale this up, 10 times 10 times 10 becomes 10 plus 10 plus 10. With just 30 combinations, you are able to say conclusively which variable had the most effect.
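A small worked example may help with the cube picture. The sketch below is not the speaker's system. It is a textbook half-fraction design for three two-level factors, where 4 runs out of the 8 corners are enough to estimate each factor's main effect, provided the factors do not interact. The factor names and the simulated CTR function are invented purely for illustration. The same counting argument is behind the "10 plus 10 plus 10" claim: with three factors at ten levels each, a main-effects-only model has 1 + 3 × 9 = 28 parameters, so roughly 30 well-chosen runs can be enough when interactions are negligible.

```python
from itertools import product

FACTORS = ("gender", "age_group", "creative")  # hypothetical two-level factors, coded -1 / +1


def half_fraction_design():
    """2^(3-1) design: all combinations of the first two factors, third set to their product."""
    return [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]


def simulated_ctr(run):
    """Made-up additive response (no interactions), used only to exercise the estimator."""
    gender, age_group, creative = run
    return 2.0 + 0.5 * gender + 0.1 * age_group - 0.3 * creative


def main_effects(runs, responses):
    """Mean response at level +1 minus mean response at level -1, per factor."""
    effects = {}
    for j, name in enumerate(FACTORS):
        high = [y for run, y in zip(runs, responses) if run[j] == 1]
        low = [y for run, y in zip(runs, responses) if run[j] == -1]
        effects[name] = sum(high) / len(high) - sum(low) / len(low)
    return effects


runs = half_fraction_design()                  # 4 runs instead of 2**3 = 8
responses = [simulated_ctr(run) for run in runs]
effects = main_effects(runs, responses)
strongest = max(effects, key=lambda name: abs(effects[name]))
```

One caveat on the design choice: in this half-fraction each main effect is aliased with a two-factor interaction, so the estimates are only trustworthy when interactions are small. If, say, a creative works only for one age group, more runs are needed to separate the two.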
For example, say I try an ad for married men, 18 to 20, with a cat, and let's say it worked. Now you don't know if the cat worked, if the man worked, if the 18 to 20 worked, or even if the location worked. That's a big question that always exists. So you launch a bunch of these, let's say 40, with lots of combinations: male, female, age group changes, location changes and so on. What design of experiments does is pick those 40 so precisely that if you take the scores of those, you can say very conclusively that male works, or 18 to 20 does not work. That's the beauty of design of experiments. Overall, it has always shown us that it condenses the combinations we have into a few options. So you say, okay, this persona we only have to show with these three and this one. At the end of the day, we have scores for every variable. Every little variable there has a score. So you say, oh, this persona, I know this guy. No, he's orange. Okay, neutral. Oh, this thing, totally red. He's never going to buy. This one, totally red. Don't even think about it. A lot of these variables just get eliminated like that. Based on that, we say, okay, let's throw this bucket away and just look at these two buckets. What are the customer sets that are important? We say married is working. Male is working for this product line. Female is working for this product line. Hey, for people who want brands, discounts are working. People who are very brand-conscious, it's kind of counterintuitive. But if you always want to wear Levi's, and you are really hung up about the Levi's part, maybe not the quality of the fit and finish, you really want the Levi's tag on it, you're most likely also going to be very discount-conscious.
If you get the Levi's at a 50% discount, you're really delighted, and that's what you're looking for. The guys who don't care about the brand don't care about the price. We've seen orders where a person is just browsing, not by brand, just looking for something, and she finds that one item, that bright yellow color, something floral for the summer. She's going to click on it, and the average order size is always going to be higher than for the brand-conscious person. A lot of these insights come out of the experiments that get run like this.
So at the end, what happens is very simple. You throw all of this data together, you take the greens and oranges, you say these are the parameters that are working, you can slice and dice them again, and you tend to expand them as well. Say I got to know that tier 2 cities are working for me, so tier 2 is my best bet. Now within tier 2, what do I do? Say this product is for the kind of guys or girls who like Multani Mitti. In Pune they're attracted by one thing, but in Kolhapur and Nanded they're attracted by something else. So Pune girls who like Multani Mitti is one set, Kolhapur and Nanded girls who are going to college is one set, and maybe that's the same thing. Maybe girls who have a Scooty or a Pep or an Activa form one set. A lot of these combinations ultimately get programmed into these networks and are thrown across. That's how the entire process works end to end.
That was a quick snapshot. Of course, I've not covered hundreds of other elements, but in this little time I thought I'd give you a quick overview of the cookie beast and how extreme it can get in terms of how much can be targeted. So a quick piece of advice: when you're doing any marketing, any type of targeting, be smart about it.
You can get the right user, and users like to have the right ads. All of these ad networks, the Googles and Facebooks of the world, even I remember from my Google days, the ads team always wanted the ad to be as relevant as the search result. It was church and state, very separated, and the top search ad should be as exciting as the top search result. So all of these companies want to get the ads right, and companies like us, startups like us, are coming in to make sure that happens. At the end of the day, you want to throw the right ad at the right user. All of it is possible. Just ask, and look through the right settings and options. You'll be good. Any questions?
What part of it is real-time? What part is batch processed?
Good question. So the question is what part is real-time and what part is batch processed. The retargeting bit, where you deal with real-time bidding exchanges, is real-time. When we run ads on Facebook with retargeting, that uses Facebook Exchange, and that is real-time. You have to make real-time bids, where the exchange says, hey, I've got that user, how much are you going to give for him? 5 cents, 10 cents, 15 cents? And you literally have to return a bid which says, I want to show ad 2 if it comes at 10 cents, I'll be okay showing ad 3 if it comes at 12 cents. These options are given to the exchange and then, boom, it picks one and ships one. So anything dealing with real-time bidding exchanges is always real-time. The cookie collection, the comparison and the offline analysis are batch. It's not as feasible to do that in real time, although it would be good to get to that. Most companies do it in batch at the end of the day.
What about time-sensitive products that you are retargeting, where the lifetime is like 18 hours, 24 hours?
Right, right. So for retargeting, simple retargeting, the real-time piece is needed. And real-time here is really near-real-time.
I think even an hour or so is okay. It's not like I go to the site and immediately I'm going to see that ad. Remember all the layers: we tossed it through an ad network, a publisher network, then a data exchange, then an ad exchange and so on. Each one, I'm sure, has at least half an hour of processing time. So there are some delays there.
Thank you. Yeah? Sure. So, Open Graph search and the Open Graph API. Yes. With the Open Graph API, what Facebook does is provide a lot of tools to advertisers like us, and a lot of options and good defaults for users to manage their privacy. You can really mine the Open Graph API well only if the user has opened up API access. If the user hasn't, then you really can't do anything. You get minimal info: you do get name and profile ID and so on. There are some powerful constructs that can be connected on Facebook, such as the profile ID. I don't know if you've heard of custom audiences. It's a feature in Facebook. With custom audiences you can give an email list, a mobile phone number list, or a profile ID list. Any of these three, and you say, I have 10,000 of these. Facebook says, great, I found 8,000 of these on Facebook, now what ads do you want to show them? And you can then run ads to them. So custom audiences is a regular feature on Facebook with regular bidding. If you use the social graph API and the Open Graph API, there is connections targeting available in Facebook. With connections, it's about as much data as you can gather by profile ID. And given that the privacy defaults are off, rightly so, it's hard to mine too much of it. You won't get much profile data. What you can get is data from fan pages. Some fan pages do give you some data.
You may get the likes for a certain cover photo, and the profile IDs liking that photo, and so on. So there are some interesting things you can do, and we do look at some of that when making these decisions.
Scraping we stay away from. We use scraping only for the site we're working with, when we're looking for out-of-stock products. That's one instance where we scrape: there's an ad running for the cat and the cat's out of stock. The website says out of stock and we're driving traffic to it. So every half an hour we scrape the website to find out-of-stock items. That is done with permission from the website owner, because they are our advertiser, and so it works. But scraping a Facebook or a Google is strongly not recommended, because you'd hit a wall in a day, or ten days, and so on. The moment you see, hey, it's successful, it's working, they'll shut you off. And then you get into a cat-and-mouse game. What we believe is, if there are certain rules, you just stick to them. It's easier to spend your energy doing it the right way, and most of the time there is a good right way.
Yeah, good question. So there are two kinds of advertisers. There are the advertisers who are about performance advertising and the advertisers who are about brand advertising. The guys doing performance advertising say, dude, I want to sell this cat. I have a 20% margin on the cat, so you can spend up to 20% of the MRP of the cat and I'll be in profit, or at least break even if you hit the full 20%. Get it as low as possible. So it's a margin model and it's very performance-driven. They don't care where we show or how we show. As long as we meet those numbers, everyone's happy. Then there's the brand advertising group, which deals with reach and frequency. And they're slowly moving online.
You see the big brands still love TV. They still love print, where it's about reach and frequency. The logic is, if you keep hammering a person over and over again, then the moment you go to a store and say, I want a washing machine, and the guy asks, what brand, the first thing that comes to your mind is the last ad you saw. So it's about how many times you have hammered the person with that ad and that brand name, so that you say, aha, I want that one. Those are different kinds of customers, and they ask about reach. They're the ones who want to do YouTube mastheads and YouTube First Watch, to block YouTube for a day. Every day when you open up YouTube, the first video you watch will have a pre-roll ad, and many times it can't be skipped. That's the first-watch roadblock for the day. We do it for a lot of clients, especially branding clients. They just want to get the word out. Then they get smarter, saying, okay, now that I have the word out, can you hit the same user again with my messaging wherever else the user goes? So now can you reach that user on the Google Display Network and so on. They get that frequency after the initial reach. So depending on the advertiser, yes, they definitely don't like the targeting if you're reducing the customer set too much and it's a branding plan.
You had a question? Yeah. So the technology is across the board. A bunch of it is actually on the Fifth Elephant Bangalore list of proposals. We have all kinds of databases, right from Redis to Mongo to MySQL, and everything on top of that is on Java. There are two surprising things we've learned over time. MySQL works. As much as we kind of pooh-pooh it, hey, this is bad, that is bad, it still works and there are no bugs. We've used Mongo successfully and everything, but we've ended up patching things up in the source code and doing things like that, inventing stuff along the way.
When you are on MySQL, you don't need to do that. Everything's there. There's actually a proposal about it where we're describing this. We have two kinds of proposals. One is how we maintain two billion records and still use MySQL. The other is about the mix of databases: what do we gain by using so many different databases at the same time? Each one has its advantages. You do need a Mongo, you do need a Redis in conjunction with MySQL in the same system, and that works as well. The other thing is the Hadoop part of it. Yes, when you're mining site data and all, we use Hadoop to do things in batch. It can't be the real-time thing. You're picking up stuff and so on. And it's great for archival. That's where you start. Slowly you realize the power of it and upgrade. But being a startup, with everything continuously moving, you have to keep the system alive while doing this. It's never a grand switch. So it's a mix of databases, and I think it's worked very well for us.
On real-time: we use a bunch of ad exchanges and networks for a lot of the real-time pieces. There are some real-time pieces where we have our own servers and shoot things back and forth. So different elements handle these things. Most of these things are batch at the end of the day. Where you have, say, Facebook or Google and you have to toss a bid or something, it's Google or Facebook that takes care of most of the hard work in terms of where the ad goes. Your ads are already there. It's about you telling them, I want ad number 35. That's an easier problem. You don't have to deal with CDNs across the world serving ads and so on. We are not as much on the ad-serving side directly. As I was mentioning, we don't serve the ad. It's the networks that serve the ad at the end of the day.
The Google Display Network, or the AdMobs of the world, or the content network. So for certain parts, let's say for directly talking to Facebook on Facebook Exchange, for example, that prototype is on Java. That's still working on Java. It's decent. We don't feel the need yet to go to C, but it may change. That keeps changing. It also depends on where their server is and what computation is needed on our side. I think the computation matters as we do more and more of it. For now, what we are running on is Java.
Right. Good question. So who dictates the policies here? To my knowledge, there is no very firm law, no very firm central body that actually does it. So every company has its own way of doing it. What we do internally is we have a review where we actually look up and decide on what policies to follow. For example, when you have a cookie tossed on someone's header, you could pretty much extract any data. But we have very strict guidelines on how to do that. So basically we have a list of all the parameters that we are going to take from the website. We send it to the advertiser, our client, and say, hey, these are the parameters that we plan to capture. Here is how we are going to use them. And here is how much of it we are going to store. Is that OK? And then the client may say, you know what? No, take off email. I don't want you storing email. Like, sure, fine. What else? Take out the time of day. Sure, we take that out. What else? And based on that back and forth, we respect whatever the advertiser has told us. And if you're working with the advertiser, let's say an e-commerce company, and you have tossed a cookie there with their permission (actually, they need to put the cookie), then what is captured is something that you both decide. And that's how you come to that.
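The back-and-forth over captured parameters described above can be sketched roughly as an allow-list: propose a list, remove whatever the client rejects, and store only what survives. This is a minimal illustration, not the speaker's actual system; the function names and parameter names (`page_url`, `product_id`, and so on) are hypothetical.

```python
# Hypothetical sketch: only parameters the advertiser approved are ever stored.

def approved_parameters(proposed, rejected):
    """Return the proposed parameter names minus the ones the client rejected."""
    rejected_set = set(rejected)
    return [name for name in proposed if name not in rejected_set]


def filter_event(event, approved):
    """Keep only the approved keys of a captured event before storing it."""
    allowed = set(approved)
    return {key: value for key, value in event.items() if key in allowed}


# The list sent to the client for review (names are illustrative assumptions).
proposed = ["page_url", "product_id", "email", "time_of_day"]

# The client says: take off email, take out the time of day.
approved = approved_parameters(proposed, rejected=["email", "time_of_day"])

event = {
    "page_url": "/washing-machines",
    "product_id": "wm-42",
    "email": "someone@example.com",
    "time_of_day": "21:15",
}
stored = filter_event(event, approved)
```

Filtering at capture time, rather than at query time, means a rejected field such as email never reaches storage at all.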
So we have strict policies in place on how to ask for that permission from the client, what to capture, how to capture it, and strict guidelines on flagging it. Hey, dude, this is PII, do you really have permission for it? And then internally, you know, that's probably another talk, but internal security for all of this is important. We're capturing user data at the end of the day, so access is limited to engineers at a certain level and above, and only those working on the project or on call for that system. We have all of those processes set up to make sure the data is always safe.
Good question. The way code evolves, you know, I'm sure everyone's been part of it. It depends on how soon you want the thing out. For anything that's critical, we have a very stringent process of code reviews, designs, and so on. In terms of the algorithms and all, it's very much a plug-and-play kind of framework where you have ETLs running to transform data the way you want it. There are certain processes that happen in batch, offline. I'd admit this is not a perfectly joined system, right? Let's say, the first time we did DOE, design of experiments, we knew R could do it. So we found a library in R that can do design of experiments. And great, we used that. So that ended up becoming a batch process. You say, here are my targeting variables. Oops, I need to jump to R to get that output there, and now jump back to my system. So yes, those quirks exist, but we tend to make all the interfaces crystal clear. And the moment somebody can join it, great. It just goes smoothly. But it evolves in terms of interfaces. Interfaces are key. You have these interfaces and everyone is a black box. Three of you running a project? Don't tell me the details. Just tell me what goes in and what comes out. Great, I'll trust you for it. Now who's next? And then you just plan how the data is going to flow at the end of the day.
Yeah, of course.
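The black-box idea above (tell me what goes in and what comes out) can be illustrated with a small sketch in which every stage shares one interface and the pipeline only plans how data flows. The stage names and record fields are assumptions made up for illustration; a batch step such as the jump to R would be just another stage with the same signature.

```python
from typing import Callable, Dict, List

# Every stage is a black box with the same contract: records in, records out.
Record = Dict[str, object]
Stage = Callable[[List[Record]], List[Record]]


def drop_incomplete(records: List[Record]) -> List[Record]:
    """Hypothetical ETL step: discard records missing a cookie id or page."""
    return [r for r in records if r.get("cookie_id") and r.get("page")]


def tag_interest(records: List[Record]) -> List[Record]:
    """Hypothetical step: derive a coarse interest variable from the page."""
    tagged_records = []
    for r in records:
        tagged = dict(r)  # copy, so the input records are left untouched
        tagged["interest"] = "appliances" if "washing" in str(r["page"]) else "other"
        tagged_records.append(tagged)
    return tagged_records


def run_pipeline(stages: List[Stage], records: List[Record]) -> List[Record]:
    """Feed the output of each stage into the next, in order."""
    for stage in stages:
        records = stage(records)
    return records


raw = [
    {"cookie_id": "c1", "page": "/washing-machines"},
    {"cookie_id": "", "page": "/phones"},
    {"cookie_id": "c2", "page": "/phones"},
]
result = run_pipeline([drop_incomplete, tag_interest], raw)
```

Because the pipeline knows only the shared signature, a stage can be swapped or reimplemented without touching its neighbours.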
So mobile is, of course, a really big one in terms of ads. There are two challenges with mobile. One is that manufacturers are moving away from allowing the device ID, the UDID, to be exposed. So how do you track a unique mobile? Many of the mobile browsers, like iOS Safari and so on, don't allow cookies. I'm like, dude, what? No cookies? Then what am I going to do? So you need to develop cookie-less tracking. You need to figure out how to save certain parameters on the server based on what the browser can return to you. So there are advances in terms of cookie-less tracking and fingerprinting. So how do you tie the ad that was clicked to the app that was downloaded and installed? The only one who knows that the app was downloaded is Google or Apple, right? Because they own the store. And you don't have a cookie there. You cannot target anything there. It's a black box for you. So your only hope is to track it when the app gets installed. And when the app gets installed, it fires a wake-up call. Like, I'm alive. Somebody made me alive. I came from this click, from this campaign. So you need to correlate that ad click, which happened in a browser that cannot store cookies, to an app that is installed in the OS of a system. It's a tough problem. That's where we're also racking our brains, and we have good prototypes there in terms of fingerprinting. The other thing is how you do this across devices. So now that you have a tablet and you have a MacBook and you have a Samsung phone, how do you know that it's the same person? There is a lot of research around that as well. So, yeah, definitely we are on it. Hope to come up with something cool soon.
No, so in terms of the mobile tracking, right? Or cookie-less, right? Right. Right. And then that's a challenge. So SSID is out of bounds. SSID would be really cool, right? So what captures SSID is apps.
So, you know, say your Maps app and so on can capture SSID. So it can kind of correlate: oh, this Wi-Fi router corresponds to this GPS location that was tagged by the same phone. So I know the location of that router. But those things you can do when you're in an app. You can't do that when you're in a browser. So in terms of the browser, you know, IP, definitely you have it. But now anybody accessing that site from ThoughtWorks would just seem to be the same person if there's only a single IP. So there are clearly other parameters, whether it's behavioral parameters, whether it's certain things you can run in JavaScript, maybe it's the cookie versions or the browser versions and so on. But that's exactly why it's a tough problem to crack.
Sorry, I think I'm over time. Great. Thank you very much. I'm going to travel and subscribe. I think I also should go up here. I don't know if this presentation is something we can put online with your app. Yeah, sure. I'm not at Google now. Sir, do we have confidence? No, no, I didn't take that out. Okay, okay. I know. So if you can email me, I'll add it to your talk today. Of course.
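As a rough illustration of the click-to-install correlation discussed in the mobile answer above, the sketch below hashes a few server-visible parameters into a fingerprint and matches an app's first wake-up call to the most recent ad click with the same fingerprint inside a time window. Everything here is an assumption made for illustration: the chosen signals, the one-hour window, and the field names are hypothetical, and this is not the speaker's prototype. In practice the browser and the app may not report identical values, which is part of why the problem is hard.

```python
import hashlib


def fingerprint(ip, user_agent, language):
    """Hash server-visible parameters into one opaque identifier (assumed signals)."""
    raw = "|".join([ip, user_agent, language])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def attribute_install(install, clicks, window_seconds=3600):
    """Return the campaign of the latest matching click within the window, else None."""
    fp = fingerprint(install["ip"], install["user_agent"], install["language"])
    best = None
    for click in clicks:
        age = install["ts"] - click["ts"]
        if click["fp"] == fp and 0 <= age <= window_seconds:
            if best is None or click["ts"] > best["ts"]:
                best = click
    return best["campaign"] if best else None


# Ad clicks logged server-side, with no cookie involved (timestamps in seconds).
clicks = [
    {"fp": fingerprint("10.0.0.5", "Phone/1.0", "en-IN"), "ts": 1000, "campaign": "summer_sale"},
    {"fp": fingerprint("10.0.0.9", "Phone/1.0", "en-IN"), "ts": 1200, "campaign": "other"},
]

# The app's first-launch "I'm alive" ping.
install = {"ip": "10.0.0.5", "user_agent": "Phone/1.0", "language": "en-IN", "ts": 1500}
campaign = attribute_install(install, clicks)

# A ping arriving long after the click falls outside the window and is not attributed.
late_install = dict(install, ts=1000 + 7200)
late_campaign = attribute_install(late_install, clicks)
```

The single-IP caveat from the talk applies directly: two people behind the same office IP with the same device model would collide under this fingerprint, so further behavioral or browser-level signals would be needed.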