All right, sounds good. Hey everyone, my name is Santhosh Naloo. I work at Distil Networks. Our office is based out of Arlington, Virginia, and we have four other offices in San Francisco, Stockholm, London, and Raleigh. I work as a solutions engineer at Distil Networks, and I've been with the company for the last two years.

So, let me start off by asking you guys: what is a bot? Do you know what this session is about? Drupal partnered with our company to protect their web applications from automated threats that were generating fake registrations on their website, and at the same time to use our high definition fingerprinting technology to identify users who were creating multiple fake accounts. We'll see how Drupal used us for this use case. But let me come back to my question: does anyone know what a bot is? Yes, creating accounts is one of the use cases, but a bot is basically a piece of script or a tool that makes HTTP requests to a web application. You can also use tools like Selenium, which automatically drive legitimate browsers like Chrome or Firefox, so that you don't have to keep doing things manually. So you can pick up a list of usernames and passwords and start brute forcing a login, and similarly, you can write a script that uses a list of usernames and passwords to create new accounts.

Drupal had been fighting fake user registrations for a long time, and it was difficult to stop this spam and fake account creation using manual techniques like IP blocking, because the bot landscape has been evolving day by day. As soon as you block a specific IP, the bots start using proxies or VPNs to attack your website, and it's easy for them to keep cycling through IPs. From a security standpoint it's a nightmare, because it's hard to keep track of these IPs; you end up playing whack-a-mole, blocking IPs that keep coming back. That's one of the main reasons Distil built around the concept of browser fingerprinting. Browser fingerprinting existed before Distil, but we took it, hardened it, and built our solution around it.

So let's talk through Distil Networks and how we partnered with Drupal to help mitigate their issues with automation. Distil is a solution that has been protecting thousands of websites from automated threats; the use cases can range from web scraping to account fraud to spam, and we'll talk about a few of them in this presentation. Gartner recognized Distil as the only bot mitigation vendor in 2015 and 2016, and recently SC Magazine recognized us as a winner for fraud prevention. We have a bunch of logos at the bottom so you know who we work with.

Anyone familiar with Neiman Marcus? Did you hear about the attack on Neiman Marcus? Last year there was an attack where a bot operator took lists of username and password combinations and tried them against the Neiman Marcus login. Once they were behind the login, they had access to really sensitive information about individual users, and they used that information for bad activity, like taking the credit card on file and buying things.
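To make that concrete, here's a minimal sketch of the kind of script being described: a credential-stuffing bot that replays a list of leaked username/password pairs against a login form. The URL, form field names, and success check are all hypothetical stand-ins, not anything from the talk.

```python
# Minimal sketch of a credential-stuffing bot of the kind described.
# The endpoint, form fields, and success marker are hypothetical.
import requests

LOGIN_URL = "https://example.com/user/login"  # hypothetical target

credential_list = [
    ("alice@example.com", "hunter2"),
    ("bob@example.com", "letmein"),
]

for username, password in credential_list:
    resp = requests.post(
        LOGIN_URL,
        data={"name": username, "pass": password},
        timeout=10,
    )
    # Naive success check; real bots key off site-specific markers.
    if resp.ok and "Log out" in resp.text:
        print(f"valid credentials: {username}")
```

Nothing here is sophisticated, which is exactly the point: a few lines of script can try thousands of logins, and from the server's side each attempt looks like an ordinary POST.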
After that attack, Neiman Marcus started looking into bot mitigation solutions, discovered that Distil is the leader in the space, and started partnering with us.

StubHub, anyone buy tickets from StubHub? Scalping is one of the major issues in the ticketing industry: there are bots out there holding tickets, buying them cheap, and reselling them to make money. It's a huge problem, and because of it, legitimate users can't get access to tickets for some of their favorite events. That's one of the reasons StubHub decided to partner with Distil, and we stop scalping and other automated threats on their website.

So what were the challenges with bots on the Drupal websites? Spam bots were writing comments, signing up for fake accounts, and so on. The Drupal website was wide open for user-generated content: anyone could create a login and start commenting. The problem is, when people are creating new accounts, how would you know whether they're fake? How would you know whether they're legitimate human users coming in from a legitimate browser like Chrome or Firefox? That was the issue Drupal was facing. And the problem with modern spam bots is that they sit behind proxies. One bot operator can cycle their HTTP requests through 500 or 1,000 different proxies, and it becomes extremely difficult to keep track of all the different IPs they originate from. You won't even know it's the same user accessing your application through these different proxies. The same goes for VPN networks: as a bot operator, I can subscribe to a VPN service and route different HTTP requests through different IPs.

So what was the solution? Drupal.org put the registration process through the Distil cloud, and we collected the high definition fingerprints that Distil generates for individual end users. We tracked these users on a hash value rather than an IP address. So if I'm a bot cycling through different IPs, because of Distil I'm no longer tracked by my IP but by the fingerprint Distil generated for me. Even though I keep changing IPs, it's easy for Distil to keep track of that bot and mitigate it.

Before Distil, there was a high rate of unconfirmed users being created on the Drupal website every day. Once Drupal started leveraging Distil's technology, they started blocking a high rate of bots, and at the same time they used the high definition fingerprint to identify legitimate human users who were creating multiple fake accounts manually. So there were two different ways Drupal was using Distil: one was our automated systems to stop the automated bots, and the second was the high definition fingerprints to identify human users creating multiple accounts.
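The core of that idea can be sketched in a few lines: derive a stable hash from many device attributes and key your tracking on the hash instead of the IP. The attribute set and hashing scheme below are simplified stand-ins for the 200-plus signals Distil actually collects.

```python
# Simplified sketch: track clients by a fingerprint hash, not by IP.
# The attributes and hash are illustrative stand-ins for Distil's
# actual 200-plus signals.
import hashlib
from collections import Counter

def fingerprint(attrs: dict) -> str:
    # Stable serialization of the collected attributes, then hash.
    canonical = "|".join(f"{k}={attrs[k]}" for k in sorted(attrs))
    return hashlib.sha256(canonical.encode()).hexdigest()

hits = Counter()

# The same bot cycling through three proxies still maps to one hash,
# because the IP is deliberately excluded from the fingerprint:
for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3"):
    fp = fingerprint({
        "user_agent": "Mozilla/5.0 ...",
        "screen": "1920x1080",
        "fonts": "Arial,Helvetica,Times",
    })
    hits[fp] += 1

print(hits)  # one fingerprint, three hits
```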
So here's a graph of the results before and after Distil. The pink line shows the average number of unconfirmed users before Distil, which is about 300. Right after they started using Distil, it tanked and ended up around 120, which is a huge difference, and then we fine-tuned the system a little further.

The Distil platform is a combination of multiple policies. How many of you have experience with web application firewalls? Okay. As you know, a web application firewall has a mix of CVE signatures and other bad bot or attack signatures, and it also gives you the ability to track users at the IP level: if a certain number of requests are made from one IP address over a certain duration, you can identify it as a threat rather than a human and stop it. Simple bots used to work that way; they came in from one IP. But like I said, the bot landscape has been evolving, and now we're dealing not just with simple bots but with moderate and advanced bots. Moderate bots cycle through a few IPs and change a couple of user agents. Advanced bots use tools like Selenium, which drive legitimate browsers like Chrome and Firefox. I have a live demo later in the session where I can show you how an advanced bot works and how Distil identifies it. So the problem with WAFs is that they stop at the simple bot, but with Distil you can identify most of the moderate and advanced bots hitting your web application.

Here's another graph speaking to the same results: once Drupal started using Distil, the number of accounts that had to be blocked dropped to almost zero.

All right, any questions about how Distil was being used by Drupal? Yes, that's a good question, and the next slide actually talks about it. So how do we generate the hash, or the high definition fingerprint as we call it? Distil's solution is a multi-layered fingerprinting technology; there are a few levels of fingerprints we generate. The most basic fingerprint is generated using well-known signals like the IP address, the user agent, and the HTTP headers. But these are all spoofable: I can use a browser plugin to change my user agent, a VPN or a proxy to change my IP address, a plugin to change my HTTP headers. That's one of the reasons we decided to look beyond these basic signals, and as part of that we collect a lot of other information, the 200-plus attributes.

Distil is an inline solution, so you have to route all your web traffic through us. There are different ways to do that: you can use our SaaS cloud, which spans 17 data centers, by making a DNS change and routing traffic through us just like a CDN; or you can put the Distil appliance within your data center as a reverse proxy and route traffic from your load balancer through Distil. The deployment model doesn't matter; at the end of the day we're an inline solution sitting in front of your origin. When we start inspecting these HTTP requests, if we're unable to identify the bot on the very first request, we let it through to the origin. The origin processes the HTTP request and responds, Distil intercepts that HTTP response and injects a JavaScript snippet into the HTML, and we render the response to the end user's browser.
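A rough sketch of that injection step, under the assumption that the proxy simply rewrites the origin's HTML before forwarding it (the snippet URL and placement rule here are illustrative, not Distil's actual implementation):

```python
# Rough sketch of inline JavaScript injection by a reverse proxy:
# insert a fingerprinting script right before the closing head tag.
# The snippet URL and fallback rule are assumptions for illustration.
SNIPPET = '<script src="/fingerprint.js" async></script>'

def inject_snippet(html: str) -> str:
    if "</head>" in html:
        # Default placement, as described: just before </head>.
        return html.replace("</head>", SNIPPET + "</head>", 1)
    return SNIPPET + html  # fallback when no head tag is present

page = "<html><head><title>Demo</title></head><body>hi</body></html>"
print(inject_snippet(page))
```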
The browser now executes Distil's JavaScript snippet, and we collect about 200 different attributes, such as font size and screen resolution. We also look at things like audio and video codecs and plugin information, and we check whether the browser can render a 2D canvas, which is basically a 2D image. We'll talk about the 200 attributes in more detail later in this presentation. All this information is collected from the browser and sent back to Distil, and Distil uses it in conjunction with the IP, the headers, and the user agent to generate a high definition fingerprint, which we call the hash. Then, even as the user keeps moving from one IP to another, we're still able to track them as they access your web application.

We understand that whenever a new technology comes out, there are bad bots, bad operators, bad actors, whatever you call them, trying to reverse engineer it, correct? For that same reason, we have tamper-proofing built into our solution. We use techniques like JavaScript obfuscation and proof of work. How many of you have heard of proof of work? It's a JavaScript-based mathematical puzzle, a hashcash challenge. We serve a mathematical puzzle to the end user's browser and expect the browser to execute it and respond with the correct answer. A browser has the resources to execute these puzzles and respond, but a tool or a script cannot, or at least has to be specially coded to get through the proof of work challenge. We have these tamper-proofing layers built in so we're not dealing with bots that reverse engineer a fingerprint and get through Distil's detections. Does that answer your question?

(Question: so you're looking beyond the headers, at headless browsers?) Yeah, exactly. We're trying to see if there are automation tools or headless browsers trying to mimic human behavior, or tools like Selenium driving legitimate browsers like Chrome and Firefox. That's what this JavaScript test is all about.

I'm going to quickly go over some slides about the fingerprinting process so you get the technical side of it and see how Drupal used Distil to identify bad bots. As part of the detection process, when you start routing traffic through Distil, the first request hits Distil and we try to generate a low-level fingerprint at this stage called the primitive ID. We compare the primitive ID against known bad bot databases that Distil curates. We have a data science team that puts together a list of signatures identified across our customer base and keeps it up to date on a day-to-day basis, and the database is also updated periodically by our platform automatically. If the end user's signature doesn't match any bad bot signatures we already know about, we proxy the first request through. If it does match, we give customers the ability to act against these bad bots. You can stay in what we call monitor mode, where you don't take any action against the bad actors, but we also give you the flexibility of serving back a challenge to the bot.
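Going back to the proof-of-work idea for a moment, here's a minimal hashcash-style puzzle in outline. In reality the browser solves this in JavaScript; the difficulty and challenge format below are illustrative choices, not Distil's actual parameters.

```python
# Minimal hashcash-style proof of work: find a nonce whose
# SHA-256(challenge + nonce) starts with `difficulty` zero bits.
# Difficulty and format are illustrative, not Distil's parameters.
import hashlib
from itertools import count

def solve(challenge: str, difficulty: int = 16) -> int:
    target = 1 << (256 - difficulty)
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce  # cheap for one browser, costly at bot scale

def verify(challenge: str, nonce: int, difficulty: int = 16) -> bool:
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty))

nonce = solve("session-abc123")
print(nonce, verify("session-abc123", nonce))  # some nonce, True
```

Verification on the server is a single hash, while a script that ignores the puzzle never produces a valid answer; that asymmetry is what the technique relies on.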
As for challenges, you can serve either a CAPTCHA challenge or an unblock form. This lets us validate whether the user can solve the CAPTCHA and get through. In most cases we expect bots not to solve the CAPTCHAs or fill out the unblock forms, as we call them, so these challenges help us block the bots at the proxy layer.

Once again, like I mentioned, if we can't match the primitive ID against the known bad bot signatures, we proxy the first request to the origin. The origin responds with the content, Distil intercepts that HTTP response and injects our JavaScript snippet and honeypot links, and we render the HTML to the end user's browser. The browser responds with the 200-plus signals we talked about, and Distil uses those signals to generate the high definition fingerprint. That's the Distil fingerprinting process in a nutshell.

Some of the other policies and techniques we use to identify bots: honeypot links — you must have used them in a previous life or your current one. This is basically an invisible link placed on your web application that's only visible to bots, because when we inject the honeypot link we style it with display:none, so it's not visible to human users. Most of the simple, dumb bots get caught when they click on that link: it's invalid, and it generates an HTTP request that lets us tag that user's fingerprint with a violation.

We briefly discussed the Distil injection process a moment ago. The Distil JavaScript test helps us verify the JavaScript engine; we expect most users accessing your web application to load and execute JavaScript, because most modern applications require it. We also detect automation: if Selenium or another tool is driving a legitimate browser, our JavaScript is listening to events triggered in the browser's DOM, so if Selenium rather than a human is generating the HTTP requests, our JavaScript can detect that. And we talked about proof of work, the hashcash challenge: a JavaScript-based mathematical puzzle served to the end user's browser that helps us make sure the fingerprints are not being tampered with.

Just to recap, Distil is a multi-layered fingerprinting technology. We have a primitive ID generated from basic information like the IP and headers, and then the device fingerprint, the high definition fingerprint, generated from the 200-plus unique markers. Now, apart from these black-and-white rules — the known bad bot signatures, the JavaScript test, rules a bot either fails or passes — there are sophisticated bots that can load and execute JavaScript, and sophisticated bots that support cookies. What about those? For that same reason, because some of these bots are really advanced, we came up with behavioral analysis. Machine learning is not new, but we put our own spin on it, and we use different vectors of information.
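Going back to the honeypot links for a second, the server-side half of that trick is simple in outline: any fingerprint that requests the invisible trap URL gets tagged. The trap path and violation store below are hypothetical.

```python
# Sketch of the server side of a honeypot link. The link is hidden
# with display:none, so humans never request it; any hit is a strong
# bot signal. Trap path and violation store are hypothetical.
TRAP_PATH = "/a9f3c1e7/trap"  # invalid link injected into pages

violations = {}  # fingerprint -> list of violations

def on_request(path: str, fingerprint: str) -> None:
    if path == TRAP_PATH:
        violations.setdefault(fingerprint, []).append("honeypot")

on_request(TRAP_PATH, "fp-1234")
print(violations)  # {'fp-1234': ['honeypot']}
```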
So those vectors: where the request is originating from — the IP address, the geolocation — how many Distil fingerprints they're generating, what parts of your website they're hitting, and at what frequency they're making requests. These are the kinds of vectors we look at to establish what typical human behavior looks like. Once we create this baseline, we know when an anomaly is accessing your web application, and we give each of these anomalies a bot score.

Just to give you a snippet of the 200-plus attributes: we talked about using the basic information like IP, user agent, and headers, and we discussed the JavaScript injection and how we collect information such as plugins, screen resolution, font size, WebGL, canvas information, media, and so on. And that's our behavioral analysis, or machine learning, process: we take the data points for every user hitting your website, generate a baseline of what looks human, and treat everything falling outside that baseline as an anomaly.

All right, I'm now going to jump into a quick live demo so you can see how Distil works in action. Sorry about that. What I did here is open a Chrome incognito session with no automation running on the browser, so this is a legitimate browser session accessing a website called santhosh.lab, right? Let me open the developer tools so I can see what's happening in the background. I'm going to reload this page, and what happens is that every single HTTP request made by this Chrome browser hits Distil first, gets inspected, and then we decide whether it's a bad bot before we proxy it to the origin. As part of this, like I mentioned, the first request is proxied to the origin and we inject the JavaScript snippet. By default it's injected right before the closing head tag, but we give you the flexibility to choose where it goes. And remember that the injection is done automatically by Distil; we're not asking you to embed anything in your backend or your web application, because we want this to be as non-invasive as possible, and we don't want you to have to change your application to implement Distil. Along with the JavaScript snippet, we also inject a honeypot link right after the first anchor tag. Like I mentioned, this is an invalid link that's only visible to bad bots, not to humans, and when a bad bot clicks on it, since it's invalid, it generates an HTTP request and we tag that fingerprint with a violation. So those are the two injections Distil uses to identify these bots.

So let's see what happens after. When the browser gets the Distil JavaScript, it executes it, and as we discussed earlier, the browser collects a bunch of attributes like plugins, font size, and screen resolution and sends all this data back as the payload of a postback. When Distil gets it, we generate a set of signatures and ask the browser to set these fingerprints as cookies for all subsequent requests. So when the browser makes subsequent calls, we know it's supposed to come in with these fingerprints, and if it doesn't, we force the end user's browser to fingerprint itself again by executing our JavaScript.
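A minimal sketch of that last rule, assuming the fingerprint travels in a cookie whose name is made up here: subsequent requests that don't carry a known fingerprint cookie get the fingerprinting JavaScript again instead of the page.

```python
# Sketch of the re-fingerprinting rule just described: subsequent
# requests must present the fingerprint cookie they were issued,
# or they are challenged with the fingerprinting JavaScript again.
# The cookie name and responses are illustrative.
FP_COOKIE = "d_fp"

known_fingerprints = {"fp-1234"}  # issued earlier via the JS postback

def handle(cookies: dict) -> str:
    fp = cookies.get(FP_COOKIE)
    if fp not in known_fingerprints:
        return "serve fingerprint.js challenge"  # re-fingerprint
    return "proxy request to origin"

print(handle({}))                      # challenged
print(handle({FP_COOKIE: "fp-1234"}))  # proxied through
```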
Any questions on the fingerprinting process? All right. Just to give you an idea of how Distil stops bots, I'm going to open my terminal window and use some simple bot scripts. When I execute this curl command, you can see that I'm not getting back a 200 OK; instead, Distil identifies this session as a bad bot and responds with a 405 status code. 405 is the status code that represents our CAPTCHA page; if you had selected a block or drop threat response, you would have seen a 416 or a 456 status code instead.

Similarly, I can use another tool. Let's say I open another browser, Firefox, and try an automation tool. How many of you have used Selenium before? Oh, a lot of web developers here, so I can completely relate to that. Selenium IDE is an easy tool, a plugin you can enable on your Firefox browser, and it helps you automate whatever you're doing in the browser. I have my Selenium IDE plugin set to record what's happening on santhosh.lab through this Firefox session, and it's running right now. So I'm going to go back to the Firefox browser and try to access santhosh.lab. When I do that, the requests go through Distil, we inject our JavaScript, we wait for the postback, and Distil immediately identifies that some kind of automation is driving this Firefox browser. Even though Firefox is a legitimate browser, we know Selenium is an automation tool, so we detect that, and based on the threat response that was selected, we respond with a CAPTCHA. We're challenging the end user to find out whether they're a human or a bot. A human can type in the CAPTCHA and get through; a bot has no other way through, apart from using a service like Death by CAPTCHA, where it can make an API call to have the CAPTCHA solved. When I go back to the Chrome session and try to access another part of the website, I can still do that without any issues, because my Chrome session was never tagged as a bot; it was always tagged as human. The goal of this exercise was to show you how accurately Distil separates bots from humans even when they originate from the same device and the same IP. And even if bots cycle through different IPs, we can still track them with the high definition fingerprints we generate for them.

So how did Drupal use us? Drupal used the policies that come built into the product to identify and block all the bots trying to register fake accounts. And for human users trying to register multiple accounts, Drupal wrote logic on the backend to look for accounts related to a specific UID. When they found multiple accounts registered under the same UID, they knew a human user was making multiple accounts from that same session.
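A minimal sketch of that backend check, assuming registrations are stored with the Distil fingerprint UID attached (the record fields here are made up for illustration):

```python
# Sketch of the backend logic described: group registrations by the
# fingerprint UID and flag any UID tied to more than one account.
# The record fields are assumptions for illustration.
from collections import defaultdict

registrations = [
    {"account": "alice",  "fingerprint_uid": "fp-1111"},
    {"account": "spam_a", "fingerprint_uid": "fp-9999"},
    {"account": "spam_b", "fingerprint_uid": "fp-9999"},
]

accounts_by_uid = defaultdict(list)
for reg in registrations:
    accounts_by_uid[reg["fingerprint_uid"]].append(reg["account"])

# A fingerprint with multiple accounts means one person (or session)
# registered them all, even if each signup looked legitimate alone.
suspects = {uid: accts for uid, accts in accounts_by_uid.items()
            if len(accts) > 1}
print(suspects)  # {'fp-9999': ['spam_a', 'spam_b']}
```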
Any questions about the fingerprinting process or how Distil detects bots? All right, perfect, let's move on.

After we started identifying and blocking bad bots on web applications, bots took another turn and started attacking API endpoints, because APIs, as you know, are much easier targets: you get the data in JSON or XML format, which is much easier to use than what you'd scrape off a web application. For that reason, Distil came up with an API security solution where we sit in front of API endpoints and track the end users trying to consume them. We leverage unique identifiers you may already be using to identify a session or a device — like a mobile application accessing your API — and we generate unique profiles for the users consuming your API endpoint. We keep track of these end users so they can't abuse your API by sending thousands of requests a second. To give you the gist of how the API security works: once again, we're an inline solution sitting in front of your origin, which in this case is your API endpoint. Once you route traffic from your native mobile apps, your dynamic web apps, or your partners, Distil takes a unique identifier or token, matches it with the IP, and uses the combination of the two to create a profile and track the users consuming your API endpoint.

I think I'm going to stop right here and see if you guys have any questions. That's all I have for the presentation. Okay, sounds good. Well, thank you so much for attending this session. Do you have a question? (Question about attacks cycling through IPs, like a DDoS or SYN flood.) Yep. When we look at API attacks with multiple API calls cycling through different IPs, we have built-in features that watch the number of tokens coming from a specific IP, and the number of IPs a bot operator is using to cycle through different tokens. So we have features for token cycling and IP cycling: if someone cycles through different IPs against an API, we can still identify and block those attacks. There are also specific thresholds you can configure for the end users hitting your API. For example, if you don't think a given end user should make more than 25 requests per minute, you can set that as a threshold, and with these profiles, if I go above it, I get tagged with a violation and you can act against me. But if there are different users behind that same IP, they each get their own token and their own profile, so they can access the API endpoint without any issues. Makes sense?

Yes. We also have something called the Browser Integrity Check: we look at what the browser claims to be, and then we validate whether the versions of software and features it should support actually match. If they don't, we tag it with a Browser Integrity Check violation. Just remember that the Distil solution isn't made up of one or two policies; we use multiple policies to identify these bots. There's no silver bullet. It's a combination of known bad bot databases, the JavaScript test, rate limiting policies, and machine learning — the behavioral analysis I mentioned. If a bot triggers any one of these policies, we tag it with a violation.
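The per-profile thresholding can be sketched the same way: key a counter on the (token, IP) profile and compare it against a configured limit like the 25 requests per minute mentioned above. The fixed one-minute window here is a deliberate simplification of real sliding-window limiters.

```python
# Sketch of per-profile API rate limiting as described: a counter
# keyed on the (token, IP) pair with a configurable threshold.
# The fixed window is a simplification of production limiters.
import time
from collections import defaultdict

THRESHOLD = 25  # requests per minute per profile

# profile -> [request_count, window_start_time]
windows = defaultdict(lambda: [0, 0.0])

def allow(token: str, ip: str) -> bool:
    now = time.time()
    profile = windows[(token, ip)]
    if now - profile[1] >= 60.0:
        profile[0], profile[1] = 0, now  # start a fresh minute window
    profile[0] += 1
    return profile[0] <= THRESHOLD

# Two users behind the same IP get independent budgets:
print(allow("token-A", "203.0.113.7"))  # True
print(allow("token-B", "203.0.113.7"))  # True, separate profile
```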
Do we still have time? Yeah, I think we're over our time limit, but just to give you an idea of how the portal looks: this is it. We break down the total traffic hitting your application into different buckets: humans, good bots, bad bots, and whitelisted.

Any other questions? Is there a Distil plugin for Drupal, is that the question? No, Distil is a proxy solution — or let me say, an inline solution. You can implement it as a reverse proxy in front of your origin: wherever you're hosted, you route traffic through us, either by putting a Distil appliance within your data center or by making a DNS change and routing traffic through the Distil cloud. So there's no Drupal plugin or Distil plugin to implement on your application.

Yes — can you repeat the question? You were not on mic. Well, we work with customers hosted on different hosting services like Pantheon. There are a few items we definitely have to check and cross-check to make sure we're compatible with the application, but that's part of the validation process: when we start working with a specific client, we go through that vetting process. Absolutely, and we've worked before with customers who host their applications on services like Pantheon.

(Audience:) We participated in the case study. One of my friends came to me afterwards and said, oh, I saw the case study, it was great. We looked into it, and it was too expensive for any of our clients, because we have a large stable of small clients. Is there some cut of the service where we could be the account holder and funnel all of our smaller websites through it, as one account? (Speaker:) Sure, absolutely, we've done that before with others. You could definitely work with your account executive to set up a partnership agreement instead of a customer type of agreement.

Well, thank you so much for attending the session today. Thank you.

(After the session:) Santhosh, I'm not sure if we talked, possibly way early on; I remember your name. — I remember, yes. I was on a couple of calls; I don't know if I was on the earlier ones, but I was definitely on some. — Yeah, because of course you have the 200 attributes. You sent me a list of IIDs and UIDs. — Yeah, we did; our data science folks and your data science team were looking at what we had and trying to make different use cases out of it. — Yeah, it's pretty interesting, and it depends on the pitch. — One other thing I've been wondering about, that I was kind of pushing for: if we had some way to report back, like we were saying, this person is definitely a scammer. We've got that data, and we want to make sure it makes it all the way back up to your bad bot database. So we could say, this is the IID/UID combo, make sure you flag it, because it's making it through your system and it's obviously not flagged. It would be nice if we had a way to keep populating your servers with data that we know is bad.
The thing with that is, Distil is a solution that serves a lot of very different customers, so our idea of bad isn't necessarily everybody's idea of bad. Something one customer reports into the database, which then gets applied for other customers, might actually be legitimate traffic for them. — Right, that makes sense. Thank you. — Awesome, thank you.