Okay, sorry about that Lex. Thanks so much for that introduction, and thanks so much to the organizers for the invitation. They've asked for a while, and it's never worked with my schedule: it happens that I teach in early May, and that often conflicts with this event. So it's fantastic both to be able to be here and, as was noted, to get to be here in person again. It's been a long time, and it's great to have the opportunity to get back to face-to-face. I focus on the intersection between law, technology and policy. There is a lot happening right now in Canada, and what I wanted to do was use my time to bring you up to speed on what is happening and alert you to the kinds of changes that have been taking place, because there have been some pretty big ones in terms of where the Canadian government is at. In a sense, I also want to urge this community to become actively engaged, because these are issues that I think directly affect all of us. They certainly affect all Canadians, but particularly those coming from the security, technical, privacy and online communities. The implications of some of these policies are, I think, profound, and it is essential that our policymakers, our politicians and others better understand the implications of what they are deciding. So I titled it, as you can see, "What Lies Behind Canada's Internet Regulation Reversal." When I refer to a reversal, I would take us back quite a while, to the late 1990s, when our broadcast and telecom regulator, the CRTC, was first thinking about the internet and what its role might be from a regulatory perspective. They took the position that they had the power to regulate online services. We didn't have Netflix at that point in time, but they took the position that the Broadcasting Act would cover those kinds of things, but that it wasn't in the public interest or the policy interest to do so. That led to some pretty
innovative, if somewhat controversial, services. I don't know how many people remember this service here, iCraveTV. It was in a sense the YouTube before there was YouTube: the service took over-the-air television streams and rebroadcast them so that people on their 56k dial-up modems could try to watch this little box of television. Television broadcasters and others thought this was outrageous. Nobody would ever, or should ever, be able to watch video on the internet, they thought, and so they tried to find ways to stop services like this. But Canadian policy was largely designed to facilitate that kind of innovation, and that approach was pretty consistent for many years. We can go forward a number of years: some of you may recall this incredibly cheesy video that then-Prime Minister Harper put out when he declared that there would be no Netflix tax. He loved Breaking Bad, which was the television show he decided to talk about, and there would be no internet taxation and no Netflix taxation. That approach held even in fairly recent times under the Liberal government.
We had then-Heritage Minister Mélanie Joly look at this issue and come to the conclusion that Canada was better off working with many of the large internet services like Netflix rather than moving towards mandated regulation. In fact, this picture here, which would be unthinkable in the current environment, didn't take place that long ago. This was the launch of the Sidewalk Labs so-called smart city project in Toronto, the smart city proposal between Waterfront Toronto and Google. I actually served as the chair of Waterfront Toronto's digital strategy advisory board. The launch event, of course, includes the Prime Minister, the then-Premier of Ontario Kathleen Wynne, the Mayor of Toronto and Google executives. Justin Trudeau doesn't want to be in the same city as a Google executive right now, much less appear on the same stage as one. I'm not sure that's right; I don't think that's the kind of policy position we ought to be taking, but that is where we stand now. Even as recently as the trade agreement between Canada, the United States and Mexico, there were policies put into that agreement that were consistent with a pro-internet, pro-innovation approach that eschewed some of the regulations we're starting to see now. So for the better part of two decades in Canada, we had a largely hands-off, pro-innovation approach, one in which those kinds of policies were imbued within our trade agreements and within the kinds of regulations we saw, whether on broadcast, telecom, copyright or a whole range of different things. Well, where are we now, and why the change? Because there has been a significant change. The kinds of things being focused on right now, either passed or being debated in the House of Commons and the Senate, include changes to our Elections Act, changes to taxation-related policies, and a trio of bills: Bill C-11, which deals with online streaming; C-18, which
is described as the Online News Act; and one further bill that isn't a bill yet, but will be one in the not-too-distant future, dealing with online harms or, as the government has now framed it, online safety. I think it's worth noting what's not on that list. Privacy, for example, a key issue in this group of talks, has not been a top government priority. There was a piece of legislation that sought to reform Canadian privacy law introduced in the last Parliament, but it went absolutely nowhere. At this stage I have talked about the bill here pretty much as much as they talked about it in the House of Commons. It was brought forward, there was scarcely any debate, and it never went anywhere, despite the fact that a lot of people would say privacy ought to be one of the core priorities. So too with competition law: there are concerns around the role that some of the largest internet platforms play, and about ensuring that our competition laws support a pro-competitive environment where there is no abuse of dominance and the like, yet that hasn't been a major priority either. Taylor Owen from McGill argues that the approach we see, both in terms of what's on the list and notably what's not, highlights a government that often seems to be focused on symptoms rather than the cause. The problems that we see, whether speech-related concerns or some of the competitive concerns around content, tend to be symptoms of a broader problem, not the cause of the problem, yet the approach we've seen to date tends to focus on those symptoms. I like to think of this approach more as the government viewing the internet as basically a policy ATM, where there's plenty of money to be taken out. Just about all the legislative initiatives that we see envision new payments coming from the big tech companies in order to fund government policies. And in fact they want the tech companies to fund the
regulatory processes as well. Now, there's a compelling argument that those companies need to contribute more. But at a time when many are concerned about our dependence on those companies, establishing structures in which our policies are dependent on those very companies only makes them more powerful and our dependence more notable. So why has all of this happened? I wanted to highlight just a couple of things, and then I will get into what has been taking place so that, if these are issues of interest, you can hopefully become more actively engaged. I think there were several events over the years that began to spark a shift, certainly not just in Canada but in other countries too, though most notably here. One, of course, if we think about some of the election-related stuff that I'll talk about in a few moments, had to do with the 2016 election in the United States and concerns about Russian government interference and the use of internet platforms in doing so. Canada, alongside many other countries, raced to try to find ways to address some of those issues. On the culture side, with respect to broadcast and content, I think it's fair to say that Mélanie Joly's proposed approach with respect to Netflix fell flat, particularly in this province. The message the government took away from that was that an approach that sought to work with these companies, even as some of those companies invested hundreds of millions of dollars, wasn't the way to go; that a sharper edge was needed, and that a more hardened regulatory approach was the one the public was looking for. Globally, events took place, whether in Europe or in Christchurch or elsewhere, that I think sparked in many a rethinking around some of the harms that were taking place online, and the view that there needed to be a more proactive regulatory approach by the government. Here's Canada joining the Christchurch Call to deal with terrorist and violent extremist content online. Now, those are some of the external events. I did, for just a moment, also want to highlight one other event, or individual, who was the target of some just awful misogynist content and attacks online, and that's Catherine McKenna, a former cabinet minister. I mention this because when you talk to members of Parliament from all parties, not just the current government but opposition parties as well, when they saw what McKenna experienced, it personalized these issues in a way that I think in many ways they hadn't been previously. Now, I don't think it should take attacks against one individual for politicians to begin to realize that it's important to have appropriate policies, but I don't think there's any doubt that when many politicians look at what takes place online, they look at it through that framing and have the view that they need to be doing something about it. Well, what is it that they have said needs to be done?
On the election side, they've already done it, and it's actually proven to be fairly successful. The view was that the best way to deal with concerns around electoral interference online was greater transparency. If we better understood who was advertising on the large platforms and how they were doing so, then opening them up and providing some sunlight into those activities would have the effect of either doing away with some of the more problematic stuff or at least allowing the public to better understand what was taking place. And so we passed legislation in Canada that required the large platforms, the Facebooks and the Googles, if they accepted electoral advertising, to create a transparency database in which anyone can go take a look to see who was engaged in that advertising. Now, the response from some was to say, well, we're going to just walk away from this altogether. Google said: if the requirement is to build this infrastructure and create all these new rules, we're just happier washing our hands of this advertising altogether, and we will simply block electoral advertising during the relevant period. Facebook, on the other hand, actually did create the database, what is now called the Meta Ad Library. And it is a fascinating experiment in the way transparency can have a real-world impact. You can go on it and see whether or not your local member of Parliament has used Facebook to advertise, how much they've spent if they have, who they've targeted and the like. It's all available there. What we learned was that when you make this kind of information more transparent, it doesn't necessarily stop politicians from advertising on platforms. In fact, it doesn't stop them from doing it at all; this kind of advertising often works. But what it often does do is stop some of the third-party, somewhat sketchier advertising that in the past might have been disguised and harder to actually identify and find. Once it is outed, once it
becomes transparent, the impact that it has is diminished, and much of that kind of advertising largely went away. Now, it's open to debate the extent to which a foreign government was looking to interfere anyway. But regardless, the approach that we took, one that didn't come in with high-handed regulations but said transparency can be effective, actually was fairly effective. That was an early approach, though, and as we'll see in a moment, the approach we've seen in other areas is, I think, quite different. Quickly, on taxation: as many of you will know by now, digital taxation is here in Canada in a number of different ways. It's certainly here in this province, and it's now here nationally with respect to digital services. There have been two core arguments around taxation. One has been the question of whether or not there should be sales taxes, so to speak, HST, QST, GST, on digital services, whether that's Netflix or some of the Apple services or Spotify and the like. Increasingly over time, as those services became more and more prominent in our lives and constituted a more and more relevant part of the economy, the argument was that governments could no longer overlook the potential revenue benefits that would come from that, much less the competitive implications of some services being required to collect and remit those taxes and other services being exempt if they were located outside the jurisdiction. And so where we find ourselves now in Canada is that these sales taxes do in fact apply. More interesting,
I think, is the so-called digital services tax, something that we don't have yet. It stems from the view that the large tech companies in particular don't pay enough tax on the revenues that they generate within this country. This isn't an issue that is playing out only in Canada. It plays out in a great many jurisdictions around the world, as countries look at these companies that are clearly doing exceptionally well for the most part and that, from a taxation perspective, have often been able to structure themselves, whether by way of the intellectual property that they hold, where they situate themselves from a corporate perspective and the like, outside of jurisdictions in order to engage in tax minimization. Now, of course, every company, everyone, seeks to engage in tax minimization, but this has become more pronounced. And so what we've seen is an increasing number of countries establishing policies where they want to charge a surtax on tech companies, called a digital services tax, basically trying to tax some of the revenues they generate from our data and from some of the other benefits that accrue within the country. There have been attempts to create an international agreement on this issue, and the approach we are seeing right now is that Canada has said that if this international agreement does not come to fruition and is not implemented by 2024, Canada will begin to levy this tax. So an attempt to make sure these companies pay starts in 2024. Those are things that are basically done. In my last few minutes, let me talk specifically about a few pieces of legislation that are forthcoming. First, what is now Bill C-11.
It used to be C-10. This is what is now called the Online Streaming Act, and it has a fairly controversial history, although it started life pretty uncontroversially back in November 2020 as Bill C-10. It was essentially billed as legislation designed to ensure that large streaming services like Netflix, Disney and Amazon Prime made contributions to the Canadian broadcasting system in much the same way that conventional broadcasters are required to. Now, there is an interesting and, I think, important debate that we could have about the contributions those companies already do make. In fact, if you take a look at the funding for film and television production in Canada right now, you find that the fastest-growing source, indeed now almost the most important source other than tax credits, is in fact foreign funding. It's the Netflixes and Amazons that are paying for much of the production that takes place right now. When there are claims that they don't fund Canadian content, that overlooks the fact that they can't. Jusqu'au déclin, the film that Netflix funded and created here in Quebec, with Quebec crews and Quebec writing, ticks every single box that you would want to have about CanCon but for the fact that Netflix owns it, and if Netflix owns it, it can't be CanCon. So when Disney creates Turning Red, based in Toronto, or Amazon creates a miniseries on the Toronto Maple Leafs, all of these things seem pretty Canadian, with Canadian crews and the like, yet they are not Canadian for these purposes. That is why you often hear that these companies don't create any CanCon: the system effectively doesn't allow them to, even if they are providing big funding. But nevertheless, it's not all that controversial to say, let's make sure that they provide these additional contributions. Where this became controversial was when the government said, not only are we going to apply these rules to the big streamers, we're going to apply them as well to user-generated content, to the content that you and I
create, or that is being created right now, let's say on the Twitch stream that this talk is going out over. The view of the government was that all of those are programs that might also be, or should also be, subject to some form of regulation. That sparked a fair amount of controversy, and the bill didn't pass. It is now back as Bill C-11. And while the current Heritage Minister, Pablo Rodriguez, has said that he has fixed the issue, just last night even the CRTC chair, Ian Scott, said no, the reality is the bill does give us the power to regulate user content. Frankly, anyone who reads the bill can see that that's plainly the case. Well, what does that mean? Once you get into regulating user content, part of it means discoverability requirements, where the CRTC could go to YouTube or TikTok or Instagram and say, we want to ensure that Canadian content gets prioritized in what people see. That sounds great unless you're a digital-first creator, and it's not even clear that you can be identified as Canadian for the purposes of some of these services. This isn't something that they typically even have information on. But it gets worse for digital-first creators, especially those who generate most of their revenue outside of Canada, because the algorithmic choices that take place on a service like YouTube look not only at what people click on but also at what they choose not to click on. If people's YouTube feeds are suddenly filled with content that is there not because the algorithm says this is something they might be interested in, but rather because a broadcast regulator says this is the kind of content that has to be included, the likelihood of clicks is far lower than for the kind of content they're looking for. The problem is the message that sends to YouTube and its algorithmic choices: that this is content people don't like when they're presented with it.
They don't click on it. And so what you end up with is more potential views in Canada, but with the multi-billion-person audience located outside the country, which so many of these creators depend upon, those creators suddenly find themselves demoted. And so bringing in the government to say that it is going to regulate this kind of content raises an enormous number of concerns. Now, we've also seen the government move forward with Bill C-18, on online news, which I think also starts from a view that, you know, who could oppose? We know that many news services are struggling right now, and we should find a way to support them. The approach the government has chosen is to follow a model that started in Australia. It basically starts by saying, let's identify the platforms that we want to make contributions: Google, Meta (Facebook), perhaps Twitter, Apple, LinkedIn, Microsoft and some others. Then let's decide which so-called news organizations qualify, and the government is taking a very broad approach. It's not just traditional newspapers. It's broadcasters, it's the CBC, it's radio stations. All of these are potentially covered. In fact, you don't even have to be Canadian in order to qualify for the system, so long as you have a couple of Canadian journalists. Then let them negotiate a deal. If they are able to negotiate a deal, let's have our broadcast regulator review it to decide if it's good enough. If the regulator says it's not good enough, you go to a mandated arbitration system in which both sides must put forward offers about what they're willing to pay, and the arbitrator has to make a decision. The government even dictates some of the kinds of things that can be offered as part of this process. Now, from the government's perspective, this was said in the House just a week or so ago:
"It's simple. Tech giants need to compensate Canadian journalists when they use their content. That's it. No more, no less. This is a market-based solution." But let's take a look at what the legislation says, and at what so-called "use" actually means. I should start by noting that "use" isn't even a term in this legislation. It doesn't talk about using news; it talks about making news content available. Well, what, according to the Canadian government, is making news content available? Part of it is reproducing the news. So if you took a news story and made it available in full text, let's say in a Facebook chat or on Google, that's something that would be required to be paid for, and I think many would say, well, we can understand that. But take a look at the second part of this definition: access to the news, including any portion of it, that is facilitated by any means, including an index, aggregation or ranking of news content. Think about what some of that might mean, even the first one, news content or any portion of it. Put forward a headline on Google or on Facebook: that's making a portion of it available. Create an index of Montreal-based newspapers, which Google naturally does, linking just to their front pages, not to actual news stories: that's facilitating access to news. The government's view is that all of these things are somehow of value not to the news site that's actually getting this traffic delivered to it, but instead that Google is somehow benefiting from sending that traffic on to these sites, not copying the news, in some instances just sending it along. Now, at the moment this would only apply to the large internet platforms. But think about what the government is actually saying here: the idea that basic search indexes might be compensable in terms of a requirement to pay, that linking might be compensable in terms of a requirement to pay. And then ask yourself what makes the news so
special. If it's news that gets to require payments for links, if it's news that gets to require payments for inclusion in a search index, what is to stop anyone else from saying, hey Google, you put my website in your index, I'd like to get paid too? Hey Facebook, someone created a link to my blog, shouldn't I be paid as well? And suddenly that kind of free flow of information that we all depend upon is lost. Now, this slide you cannot make out, but it is just a slide that highlights the number of issues and policies in this bill that are dictated by either the government or the CRTC. I put this up to tell you that this is what the government thinks is limited market interference. It's deciding everything from who participates, to what these agreements can look like, ultimately to approving the agreements themselves. Now finally, in my last two minutes, the last issue: so-called online harms, now often called online safety, meaning issues around hate and terror and the like online. On that file, the government conducted a series of private consultations and had planned to introduce legislation, but held off given the controversy over C-10 and, I think, the prevailing view that the then-Heritage Minister, Steven Guilbeault, was maybe not the best communicator when it came to that bill, much less a bill dealing even more controversially with online harms. So what they did instead was hold a consultation last summer, and they made the decision not to make the submissions available to the public. Instead, they issued a "what we heard" report in which they summarized what they heard. They acknowledged that there were criticisms, and they said, we'll create a new panel to take a look at some of these issues. Now, I should note that I launched an access to information request that said, hey, I think if you launch a public consultation you ought to make the submissions that you receive publicly available. I got those submissions about a month ago, and it turned out that the
government's report didn't tell the whole story. You had companies like Twitter compare some of the policies to internet policies in China and North Korea. You had groups ranging from Muslim groups and Jewish groups to other vulnerable groups all saying that the kinds of risks this put forward were significant in and of themselves. In other words, the legislative proposals that were designed to help support and protect these groups were actually creating potential harms. The reason for that is that the government envisioned things like proactive monitoring: requiring platforms to proactively monitor chat and activities on these sites and, if they identified content that they thought violated some of the laws, to proactively disclose that to law enforcement. Think about the possibility that you post something without context in a chat, and an AI sees this and for whatever reason thinks it represents a threat and notifies the police. The government thought that was a good idea. The government thought that requiring platforms to remove content within 24 hours, without any due process or review, was a good idea. The government thought that website blocking for those platforms that wouldn't block that kind of content was a good idea: let's conscript all the various telecom providers to build in blocking technologies and then require them to block these sites. That's why you saw so many speak out against it. So let me just conclude by saying there is a lot happening. There has been a major reversal in the way that the government has approached these issues, and I would say, candidly and quite discouragingly, that the voices we've heard to date have for the most part been largely ignored. The government is marching ahead, sometimes, in my view, engaging in just plain gaslighting, in which it says that the legislation says one thing when anyone who takes the time to read what the legislation actually says realizes it's the opposite. That said, the time is
now for these policies. There are going to be hearings, for example on C-11, in the next few weeks. There is an opportunity for concerned Canadians to make their voices heard, regardless of what your view happens to be. We deserve policies that reflect the broad view of Canadians and that are properly informed by those who are expert, not just on some of the speech implications but on the technical side too. There's an opportunity here, and I hope many in this audience will grab it. Thanks very much for your attention. Thank you.

Okay. Hi. Thank you for the great introduction. I've been working on offenders' behavior and decision making for more than seven years now, and I will continue in the future, but today I wanted to talk about the protection of users, because we will talk about passwords. Password best practices imply that the password is impossible to remember and that it is never written down. The idea of protection behind passwords is an excellent idea, but like every technology it comes with consequences. This solution seems to have been made more for computers than for humans, because it's impossible for humans to just remember a list of a hundred passwords made of strings of random characters. I know that today we have solutions like password managers, but most people don't use them yet, and you still need a password to use one. Studies show that there is a difference in cybersecurity knowledge, or literacy, across countries, and this is what I wanted to explore in this research project. Each year the company NordPass releases a list of the 200 most common passwords by country. The list of passwords is a compilation from the cybersecurity incidents that happened in a year. It comes, of course, from data breaches containing users' passwords, and the list is compiled from 4 terabytes of information.
So that is a lot of data breaches, and this year it includes 49 countries; I don't know if that is the case every year. Here is some more information about the list, the data that I used for this research. The 200 most common passwords for each country cover between 169 thousand and 146 million users per country. This means that in some countries there are 146 million people who use the same 200 passwords. But it also means that in other countries far fewer people use the same 200 passwords, and that is very good for them. The list contains the most common passwords, not necessarily the worst passwords. To determine how good a password is, you can observe the cracking time. When an attacker steals a list of usernames and passwords, the information is usually encrypted, and the time to crack is the time it takes to decrypt the information. The average time to crack a password in this sample is more than two million seconds, and it ranges from zero to three billion seconds. Here the mean time to crack is not a very good measure for understanding the sample, because it is very high and it makes it look like everyone is very good at creating passwords, but that is not the case. The vast majority of passwords in the list can be cracked in less than a minute. The mean time to crack is high because there are some pretty good passwords among those 200 most common passwords. A much more representative measure is the median rather than the mean: if you order the cracking times in ascending order, the median is the value right in the middle, and here it is two seconds. That means that at least half of the passwords in the list can be cracked in two seconds or less. This leads us to wonder which countries are the best and the worst in terms of password performance. I based this calculation on the mean time to crack, the
maximum time to crack present in the list, the number of users sharing the same password, and the percentage of passwords that can be cracked in less than a minute. So here is the list of the best countries. To be in this list, a country has to be among the best on two to three of the criteria I just mentioned. The countries with the little stars were the best on two to three criteria, yes, but they also came up in the top 10 worst for one criterion. Okay, so they are not quite so good. Take Canada, for example. We are among the best for the maximum time to crack and for the mean time to crack. Nevertheless, we are among the worst for the number of passwords that can be cracked in less than a minute. This means that most people choose weak passwords, but that a larger number of people, compared with other countries, have better password habits because they use stronger passwords. Here is the list of the worst countries in terms of password performance. Same thing here: to be in this list, a country has to be among the worst on two to four of the criteria that I presented. If there's a little star here, it's a little bit more positive: it indicates that this country also came out in the top 10 best for one criterion. Okay, so let's take the United States as an example. They are the worst in all categories except for the mean time to crack. That means that some people have very, very strong passwords, and it increases the mean for the country. So in the United States some people have better password hygiene.
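To make the mean-versus-median point concrete, here is a minimal sketch in Python. The crack times are made-up illustrative values, not figures from the NordPass list, and the function name `summarize` is hypothetical. It computes the mean, the median, the maximum, and the share of passwords crackable in under a minute for one list; the criterion based on the number of users sharing a password is left out because it needs user counts.

```python
from statistics import mean, median

# Hypothetical time-to-crack values in seconds for one country's list.
# These numbers are invented for illustration only.
crack_seconds = [0, 0, 1, 1, 2, 2, 3, 10, 45, 3_000_000_000]


def summarize(times):
    """Return summary measures of a list of cracking times (in seconds)."""
    under_minute = sum(1 for t in times if t < 60)
    return {
        "mean": mean(times),              # pulled up by the single strong password
        "median": median(times),          # middle value of the sorted list
        "max": max(times),                # strongest password in the list
        "share_under_minute": under_minute / len(times),
    }


summary = summarize(crack_seconds)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With these invented values, one very strong password is enough to push the mean into the hundreds of millions of seconds while the median stays at two seconds, which is the same pattern described for the real sample.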
We can conclude that. So why do countries not have the same performance level in terms of password choice? This leads to the hypothesis that characteristics of the countries contribute to the overall performance in password strength. In other words, which macrosocial variables predict the overall performance of a population in terms of password strength? That was my research question.

Here is how I proceeded to answer this question. To account for the strength of passwords, I took into consideration the mean time to crack. Remember, I said earlier that the mean time to crack was not a good criterion for understanding the sample. But now, because we want to compare the countries overall, it is a very interesting measure to look at. Then several macrosocial variables were considered to create a model explaining or predicting password strength. This was the goal of all this. A total of 29 different measures were considered in the exploration of possible models. Of course, the model does not contain those 29 variables, but here they are; I looked at all of them. Here's the list I've been testing. The data are the most recent available, coming from official sources like the World Bank, for example.

This was too many variables to enter in a model, of course, so many tests were done to select those that would form the final prediction model. One example of a test that I did before starting to create the model is a correlation matrix. Variables that correlate too highly with each other would create a problem of multicollinearity for our prediction model, so we need to avoid that. For example, here the number of internet users was highly correlated with the level of digital adoption in a country. That makes sense, right?
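The correlation screening described here can be sketched in a few lines of Python. This is only an illustration: the variable names, the five made-up country values, and the 0.8 cutoff are assumptions, not data or thresholds from the talk.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def collinear_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every pair whose |r| exceeds the threshold."""
    flagged = []
    for a, b in combinations(sorted(columns), 2):
        r = pearson(columns[a], columns[b])
        if abs(r) > threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged

# Made-up values for five hypothetical countries (not data from the talk).
columns = {
    "internet_users": [35.0, 52.0, 68.0, 81.0, 93.0],
    "digital_adoption": [0.31, 0.48, 0.61, 0.77, 0.88],
    "literacy": [91.0, 72.0, 99.0, 85.0, 78.0],
}

flagged = collinear_pairs(columns)
print(flagged)  # pairs that should not both enter the regression
```

With these toy numbers only the internet-users and digital-adoption pair is flagged, which mirrors the kind of pair the speaker says had to be reduced to a single variable.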
So it would have been an error to put those two variables in the model; we had to take one out. Same thing here: political stability is strongly correlated with regulatory quality, but also with control of corruption, with government effectiveness, and with a bunch of other variables. So it is an example of a variable that is not interesting to put in the model. Okay, so you're following me. This helped me to select and try the different variables from the list.

Then I used a multiple linear regression to build a prediction model of password strength. I tried a dozen different models to finally identify the variables that were in fact predicting password strength. After many tests, I kept six variables from the list I previously showed you, and here I present them to you.

First, voice and accountability. This is one of the six components of the governance indicators as defined by the World Bank. It reflects perceptions of the extent to which a country's citizens are able to participate in selecting their government, as well as freedom of expression, freedom of association, and freedom of the press. In other words, it gives an idea of the overall liberty of the population. So let's see if this has a link with password strength.

Then there's the Global Cybersecurity Index, which is a trusted reference that measures the commitment of countries to investing in cybersecurity. So do countries invest in cybersecurity, and does this have an impact on password strength?

Digital skills was my third variable. It represents the extent to which the active population possesses sufficient digital skills.
And this includes computer skills, basic coding, digital reading, a bunch of things.

The Cybersecurity Exposure Index is based on data collected from publicly available sources, like the dark web, the deep web, and data breaches. Based on that, we want to calculate the exposure of a country: how many attacks did it suffer in a year?

Then literacy. The level of literacy in a population measures the percentage of adults who are able to read and write in their own language. A higher literacy rate is an indication of a higher standard of education, so this was an interesting variable to put in the model.

Finally, we tested GDP per capita, so the gross domestic product per person. It is a standard measure of the value added created through the production of goods and services in a country during a certain period of time. As such, it also measures the income earned from that production, or the total amount spent on final goods and services. It is a variable that is strongly associated with all kinds of aspects of technology in a country; that's what we see in the literature. So it made a lot of sense to include it in the model to see if it also correlates with password strength.

So here is the result. I didn't put up the boring table that goes with it. The first four variables were predicting password behavior: when they increase, password strength also increases. And the last two did not predict password strength.
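To make the modelling step concrete, here is a minimal sketch of a multiple linear regression fitted by ordinary least squares in plain Python. It is not the speaker's model: the two predictors and the outcome are synthetic, generated from known coefficients so that the fit can be checked, and the function names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(rows, y):
    """Ordinary least squares via the normal equations.

    Returns [intercept, coef_1, ..., coef_k] for k predictors.
    """
    design = [[1.0] + list(row) for row in rows]  # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * target for d, target in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

# Synthetic example: two hypothetical country-level predictors per row.
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 3.0)]
# Outcome generated from known coefficients (1.0, 2.0, 0.5), purely for checking.
strength = [1.0 + 2.0 * a + 0.5 * b for a, b in rows]

coefs = fit_ols(rows, strength)
print(coefs)  # intercept followed by one coefficient per predictor
```

A positive fitted coefficient corresponds to the speaker's reading that password strength increases when that variable increases. A real analysis would also report standard errors and significance, which this sketch leaves out.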
Okay, so let's explore each of those. The popularity or the spread of the internet in a country has been associated by researchers with a greater level of voice and accountability, which means the liberty of a population. A strong positive association has been shown between security capacity and voice and accountability in the literature. This goes along with the result of our study, because it was a good predictor of password strength.

Researchers have shown that higher cybersecurity is related to a lower similarity between passwords inside the population, and therefore better habits in selecting passwords. So our result confirms this section of the literature. We used the Global Cybersecurity Index, which measures the commitment and investments of the countries, and we found that countries' investment in cybersecurity predicts stronger passwords. So yes, investment in cybersecurity pays.

The results show that the number of cybersecurity incidents in a country is positively associated with password strength. So the more a country is under attack, the more people use strong passwords. That's very interesting, because it suggests that people are sensitive to the importance of protecting data with strong passwords when they are exposed to more cybersecurity incidents. The literature has documented that users are well aware of the meaning of a data breach. We have seen in the past that when a company flags or notifies a data breach, the market goes down for them. And we also see that after a data breach, in one study, as many as 75% of people said they would change their password or switch accounts.
So people know that a data breach is bad, and they know that they have to modify their behavior, and they seem to be doing it. This goes along with all this literature too.

Literacy is an important aspect to consider, from my perspective, in this study, as it is directly connected to the use of technologies. To seek, evaluate, and use information found on the internet, readers must navigate through their reading process; if not, it's too complicated. Because being knowledgeable is closely related to the capacity to acquire knowledge, and that means knowing how to read, people with a low level of literacy would have a lot of problems adapting and learning. So the result is not surprising: when the level of literacy of a population increases, the strength of passwords increases too. It makes a lot of sense.

And then these two variables in the model were not predicting password strength, but I chose to show them to you because it was really interesting too, because we thought that they would correlate. Digital skills have been defined as the ability to use various digital technologies or applications. Digital skills have been shown to impact a variety of online behaviors. However, our study points toward the fact that digital skills are not a synonym for efficient use of protection. We thought that they would go together, but it's not a good indicator.

I also tested GDP, and you might wonder why, but the adoption of technology in a country has been shown to be impacted by many factors, including the economic growth of a population. Underdeveloped and developing countries have inferior infrastructure and less effective manpower, and their business models have not yet shifted from the industrial age to the information age. This result might be, sorry,
the result of the study indicates that wealth disparity does not influence strong passwords. This result might be explained by the sectors in which developed countries invest. Also, past studies have shown that countries need to acquire experience with IT before their investments start to pay. So if I want to summarize all this, the result says that benefiting from resources is not enough on its own to be effective in the use of technology and protection.

This is my last slide. In this presentation, we looked at the factors that most influence password strength, which offers new knowledge about password habits in various developed and developing country settings. The big conclusion is that yes, the environment of users affects their behavior and their choice of passwords. The results point to the importance of countries' investment in cybersecurity. They also show that democracies help users make better choices for password protection; this is probably due to access to information. Also, there's a new hot subject in cybersecurity that you might have heard of: we talk a lot about the resilience of users. And this is one of the conclusions of this study too: users are resilient. We know that because populations that are victims of higher levels of data breaches do better at choosing passwords. And finally, general education, represented through the level of literacy, is a better indicator of password strength than specific digital skills.

User habits in relation to cybersecurity are frequently examined from a micro perspective, using for example survey results to obtain impactful factors
that influence individual decision making. But our research differs by focusing on country-specific factors. Those factors help determine users' vulnerabilities at a macro level and might be useful for policy around cybersecurity. So that's it. Thank you.

Thank you so much, Andréanne. We're going to come back in five minutes, so 10:36. See you at 10:36.

... in the right to be forgotten, specifically about a new field of research that takes the laws that lawmakers decide on and tries to use tools from cryptography to analyze and formalize the requirements of these laws. So I'm going to look today at the right to be forgotten. I'm going to define in a moment what I mean by the right to be forgotten, because there are other accepted definitions of it, or it is used in a broader setting than what I use it for in this talk.

Let's say we have a user who interacts with a server. I might use Google as a running example throughout this talk. The user maybe uploads some photos to Google Photos, maybe uploads some files to Drive, and shares these files with other users, and Google might use this content for image recognition or other machine learning tasks to build products upon. Now, in the future, this user might become privacy conscious and ask Google to just delete all of their data, so that there's not a bunch of photos of that person on the web. This is specifically what I mean by the right to be forgotten in this talk.
There is a broader definition, where there might be some news articles or other content about you online that you wish to be taken down. This is outside the scope of this talk; I'm only interested in the setting where you can prove ownership of the data that you want to delete.

The right to be forgotten was codified in a handful of pieces of legislation. The most famous example is the GDPR in the European Union. There is also a law in California that has been dubbed the right to be forgotten for minors; it's a privacy law protecting minors. The other example that I could find is not really a law; it's more case law in Argentina, where the courts gave rulings that go along with this notion of a right to be forgotten.

This is an excerpt from the GDPR, just to know what we're talking about. I highlighted some passages. The right to erasure, as they call it, is that we should have the right to have our personal data erased and no longer processed. And this right is transitive in some sense: it follows the data and not just the user. The right is extended in such a way that a controller who has made the personal data public should be obliged to inform the other controllers that the data is requested for deletion, and then these other controllers should process the deletion as well. This will come up in later slides as well.

There's an intuitive notion of what it means to delete data: if you have a database, you just delete the entry in the database. What is less intuitive is what happens to data that has been processed. If I take the data and do statistics on it, or run a machine learning algorithm on it, what does it mean in that case
to delete the data? And there's also the question, with the right to erasure following the data along the data processing, of how you keep track of this data along this line of processing.

These laws, as you have seen, are written in human language, which tends to be open to interpretation and less formal. So the goal here is to take these laws and formalize them in the language of cryptography, in a formal mathematical definition. The goal is that if you satisfy this definition, then you know for sure that you satisfy the requirements of the law, though you might do more than what the law requires. The first example of such multidisciplinary work was pioneered by Kobbi Nissim in these two papers. What they have done is look at privacy laws and show that mathematical privacy notions such as differential privacy meet the minimal requirements of those laws. I encourage you to look at the following YouTube talk by Kobbi Nissim, which was at STOC 2021, which is a great introduction to this new field.

Now let's go into the meat of the subject. How do we formalize the right to be forgotten? We need a formal requirement that says: if I ask you to delete this data, then you will take some actions that will guarantee me that you meet the minimum requirements of the GDPR, for example. In order to formalize it, we'll first have to model the interaction and define our participants. We will define three special entities. We have a data collector, which I think is called the controller in the GDPR, which is the server.
It's Google, which collects data, offers services, and does some processing on the data. We have the deletion requester, which is our privacy-conscious user, who will ask for his or her data to be deleted. And then we have a special entity called the environment. The environment captures the rest of the world: if I'm the user and I interact with Google, all of this room and all of the rest of the world goes into the environment. The environment is also our adversary in this case. In cryptography we always have a bad guy that we call the adversary, and the environment will capture this adversary. We want to limit what the environment can learn about the user's data after deletion.

We assume that interactions have a beginning and an end, just so things are very precise, and each entity will have two variables that define it. Each entity has a view, which is all of the messages sent to and from that entity during the execution of the interaction, and a state. The state is the content of all the hard drives, all of the floppy disks, etc.: all of the information that is known by an entity.

The game is as follows. You have the data collector that interacts with the user and interacts with the environment. Every user, be it the deletion requester or the other users, interacts in some way with the data collector, and each of the protocols that go on in this way has an associated deletion protocol. So I can upload something and I can ask for it to be deleted. Sometimes this deletion protocol can be trivial: if I do a query that doesn't change the state of Google, then there's no deletion protocol associated. The deletion requester will delete everything by the end of the execution. At the end of the execution, we have all the parties that have interacted in some way, and then we look at these two special variables: the state of the data
collector, which represents what the data collector knows about the users, and the view of the environment, which represents what it has learned about the users. The criteria will be defined based on those variables.

The first paper to formalize the right to be forgotten was by Garg, Goldwasser, and Vasudevan in 2020, so this is very recent research. I put "strong" in parentheses because they originally called it just deletion compliance, and there have been other definitions proposed by other authors and by myself, which we will see later in this talk. Their definition is based on the real-ideal paradigm. In cryptography, the real-ideal paradigm is very useful for defining the security of some tasks. You define the ideal task that you want to implement, and then you show that the real protocol that you're using in real life is indistinguishable from this ideal case. In our setting, we have the real world as we saw in the previous slide, and we have the ideal world defined as follows. What do we want from a deletion? Intuitively, any data that we ask to be deleted should result in a state for the data collector that is consistent with having sent no data at all. This is what the ideal world captures.
In the ideal world, the deletion requester did not send the data in the first place. So the criterion is that if the real world is indistinguishable from the ideal world, which is captured by this small approx symbol, then anything that the adversary, or anyone, can learn about the data that I've deleted in the real world is the same as what it can learn in the ideal world, which is nothing at all, because I didn't disclose my data in the first place. This is a requirement on both the state of the data collector and the view of the environment, because the authors of that paper argue that once you disclose the data, you have lost control of this data. You cannot delete it anymore. If I publish something on Facebook with absolutely no privacy controls, then that picture is public, anyone can copy it, and I have no further control over this data.

Another advantage of this real-ideal paradigm is that if you interact with multiple data collectors, and if they interact with each other in a web of protocols such as the internet, then this notion of deletion compliance is closed under composition. Say you have all of these data collectors that are deletion compliant, and my data follows this blue line through these data collectors. I could, for example, use a service that stores its data in the cloud and then outsources some analytics or some advertising. If my deletion request follows this blue line, and each of those data collectors is deletion compliant, then my data should be erased: the system as a whole should satisfy the requirement that it is as if I didn't send data at all.

But as I mentioned earlier, there are some drawbacks with this definition. Before that, let's see the slide that I forgot I had. Let's look at an example of how you would implement such a definition.
The example is differential privacy. Differential privacy, for those not familiar with it, is a property of database queries that says that whether I query the database on the left or a database with one element removed or one element changed, the resulting outputs should be indistinguishable. This query f you could think of as asking: what is the mean value of these amounts? Of course, if I return the exact mean value, it's going to be different on the left-hand side from the right-hand side. So what differential privacy does is add noise to the output, which means that if you add enough noise, then this 32,000 not being in the data will not be noticeable from the outcome alone. So if you have a data collector that only exposes differentially private information to the world (it just holds a database and exposes differentially private information), then all it needs to do to be deletion compliant is, upon a deletion request, just delete the entry from the database, and that's it.

Now let's come to the "strong" part. What are the drawbacks of this real-ideal paradigm? In some sense, this definition captures two notions at once. The first is compliance with deletion requests: I will delete the data that you sent to me. The other is privacy: we ask that the adversary cannot distinguish between when I sent my data and when I didn't. A consequence of that is that the data collector cannot share any data from the deletion requester, even if that was the intended use of the service, for example on Facebook or another social network. There are other technical hurdles with this definition as well, which I won't go into. We will now look at other tentative definitions of this right to be forgotten.
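The differentially private data collector described in this part of the talk can be sketched as follows. This is a toy illustration under stated assumptions, not the construction from the paper: the `DataCollector` class, its method names, the clamping bounds, and the choice of the Laplace mechanism with a publicly known record count are all assumptions made for the example.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

class DataCollector:
    """Toy collector that only ever releases a noisy mean of its records."""

    def __init__(self, lower, upper, epsilon, seed=None):
        self.records = {}  # user_id -> amount
        self.lower, self.upper = lower, upper
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def upload(self, user_id, amount):
        self.records[user_id] = amount

    def delete(self, user_id):
        # Deletion request: drop the entry; no other stored state depends on it.
        self.records.pop(user_id, None)

    def noisy_mean(self):
        if not self.records:
            raise ValueError("no records to query")
        clamped = [min(max(v, self.lower), self.upper) for v in self.records.values()]
        true_mean = sum(clamped) / len(clamped)
        # Changing one record moves the clamped mean by at most this much.
        sensitivity = (self.upper - self.lower) / len(clamped)
        return true_mean + laplace_noise(sensitivity / self.epsilon, self.rng)

collector = DataCollector(lower=0, upper=100_000, epsilon=0.5, seed=7)
collector.upload("alice", 32_000)
collector.upload("bob", 41_000)
collector.upload("carol", 58_000)
print(collector.noisy_mean())   # noisy answer with alice's record present
collector.delete("alice")       # the whole deletion protocol
print(collector.noisy_mean())   # noisy answer without it
```

A smaller `epsilon` means more noise and therefore less visibility of any single record in the released value. The sketch ignores practical concerns such as the privacy budget consumed by repeated queries.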
There's no "right" formal definition yet; I'm just presenting what has been done so far. The first one we're going to see is by Gao, Garg, Mahmoody, and Vasudevan, which was to appear in 2022 at the PETS conference. Their modification is as follows. Instead of having a real world and an ideal world where the deletion requester sends no data at all, in this case the deletion requester has a bunch of data and then deletes some of it, and the criterion is that these two worlds are indistinguishable: the adversary cannot tell which of e0 and e1 was deleted by the deletion requester. And I should emphasize that in the original definition of Garg, Goldwasser, and Vasudevan, just the fact that a deletion happened is a breach of deletion compliance. Just knowing that the deletion requester interacted with the data collector is a breach of deletion compliance. So this is one attempt to weaken it.

One setting in which it can be applied is machine unlearning. Machine unlearning is a task where you train some machine learning algorithm on a dataset D, and you have an associated deletion operation for your learning algorithm, where if you take a trained model and delete some point u, it should be the same as training a fresh model on the dataset minus the point u. This is an example of how you can satisfy the right to be forgotten in the case of machine learning. The security of this task is that no adversary should be able to tell what was deleted, right?
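The unlearning condition stated here, that deleting a point u from a trained model gives the same result as retraining on the dataset without u, can be illustrated with a deliberately simple model. The sketch below uses a count-based "model" because exact unlearning is trivial for it; the function names and the toy dataset are hypothetical, and real machine learning models generally need far more elaborate techniques.

```python
from collections import Counter

def train(dataset):
    """'Train' a toy model: count how often each (feature, label) pair occurs."""
    model = Counter()
    for feature, label in dataset:
        model[(feature, label)] += 1
    return model

def unlearn(model, point):
    """Return a copy of the model with one training point removed."""
    if model[point] == 0:
        raise KeyError(f"{point!r} was never part of the training set")
    updated = Counter(model)
    updated[point] -= 1
    if updated[point] == 0:
        del updated[point]  # drop zero counts so no trace of the point remains
    return updated

dataset = [
    ("sunny", "out"),
    ("sunny", "out"),
    ("rainy", "in"),
    ("rainy", "out"),
]
u = ("rainy", "out")

after_deletion = unlearn(train(dataset), u)
retrained = train([p for p in dataset if p != u])

# The unlearning condition: both routes give the same model.
print(after_deletion == retrained)
```

Removing the zero-count entry matters: leaving a key with count 0 in the model would reveal that the point had once been there, which is exactly the kind of residue the definition is meant to rule out.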
You define it as follows: you train a model, then you let an environment, an adversary, interact with it. Then you delete a point from the model and let the adversary interact with it some more, and it should not be able to tell which point was deleted. So if you satisfy the previous definition of weak deletion compliance, then you should be secure against deletion inference in the case of machine unlearning.

Another proposed weakening, which was proposed by myself and Jonathan Godin from Université de Montréal, in parallel with the one that we just saw, so it's independent work, is based on the concept of simulation. Simulation in cryptography is used, for example, in zero-knowledge proofs. In zero-knowledge proofs, how do you prove that you have not learned anything? You show that by building a simulator, which is a machine that does not have access to any new knowledge and that can produce the same outcome as someone that interacts with a prover. In this setting, what simulation says, essentially, is that I will construct a simulator that will not interact with the deletion requester and still produce the same state as the data collector. So after the deletion, it will have produced the same state. But the trick is that I will give this simulator the view of the environment, so it will have everything except the data produced by the deletion requester.

This is just a repeat of what I was saying. You will have this simulator that will internally simulate the execution, but without access to this arrow.
It will only see the view of the environment. So this simulator S receives the view of the environment and produces some output that should be close to what the data collector produces. And this is just the definition.

The benefit of such a definition is that it captures a broader class of data collectors. In the context of strong deletion compliance, you cannot ever have Facebook be deletion compliant, because Facebook, by its nature, shares data, and as soon as you share any data from the deletion requester, you cannot satisfy the deletion compliance criterion. What this says is that, instead of just having the real and ideal worlds be indistinguishable, I will build a machine that will reconstruct the state, reconstruct all of the servers of Facebook, but without the data that I sent to it in the first place. If you can show that such a machine exists (you don't have to build it, you just have to show that it's possible), then you satisfy this notion of weak deletion compliance.

One property that implies this definition is history independence. History independence is a property of data structure implementations. What I mean by implementation is that, for example, if you have a graph, you can implement it at a lower level using an adjacency matrix; if you have a queue, you can implement it using a linked list; if you have an associative map, you can implement it using a hash function with a list. These implementations sometimes depend on the state of the representation; sometimes they depend on the order in which the operations are done. For example, if I have this graph in the upper right corner and I build it using a series of add-vertex and add-edge operations, doing these operations in different orders will result in different representations. And what history
independence says is that if you have an implementation that is history independent, then these representations will be the same regardless of the order of operations. They only depend on the content of the data structure, not on the order in which it was built. For example, in this adjacency list, I can sort inside the parentheses and then sort on the first and second node of this list. A sorted list is kind of the canonical example of a history-independent data structure.

And now to conclude. The definitions that are built in this way are intended to be stricter than the laws they model. So if you satisfy the formal crypto definition, then you should meet the requirements of the law; that's the intended goal. Of course, there's still no "right" definition of what it means to forget data. This is very young work, and we're still missing the multidisciplinary approach that is found in the other work by Nissim, where there were legal scholars working with computer scientists to come up with these notions. For now, this has only been the work of cryptographers and computer scientists. Even though these definitions seem arbitrary and complex, there are tools that exist to easily meet them. We've seen three: differential privacy, history independence, and machine unlearning. And one thing that's very important is that the honest participation of the data collector is crucial, because you cannot force someone to delete something. You need to have someone that is willing to comply with the laws and that will do so on a best-effort basis. Thank you.

Thank you so much.

Thank you.

Yeah, thank you so much.
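The history-independence idea from the talk, sorting inside each pair and then sorting the list of pairs, can be sketched as below. Both classes are hypothetical illustrations, and the property is shown only at the level of Python values; a real history-independent implementation would also have to control the underlying memory layout.

```python
import bisect

class AppendOnlyEdgeList:
    """History dependent: edges are stored in the order they were added."""

    def __init__(self):
        self.edges = []

    def add_edge(self, u, v):
        self.edges.append((u, v))

class SortedEdgeList:
    """History-independent sketch: the stored list depends only on the edge set."""

    def __init__(self):
        self.edges = []

    def _locate(self, u, v):
        edge = (min(u, v), max(u, v))  # sort inside the pair
        return edge, bisect.bisect_left(self.edges, edge)

    def add_edge(self, u, v):
        edge, i = self._locate(u, v)
        if i == len(self.edges) or self.edges[i] != edge:
            self.edges.insert(i, edge)  # keep the whole list sorted

    def remove_edge(self, u, v):
        edge, i = self._locate(u, v)
        if i < len(self.edges) and self.edges[i] == edge:
            del self.edges[i]

def build(cls, operations):
    """Apply a sequence of add-edge operations to a fresh structure."""
    graph = cls()
    for u, v in operations:
        graph.add_edge(u, v)
    return graph

# The same triangle, built in two different orders.
order_a = [(1, 2), (2, 3), (1, 3)]
order_b = [(3, 1), (1, 2), (3, 2)]

print(build(AppendOnlyEdgeList, order_a).edges == build(AppendOnlyEdgeList, order_b).edges)
print(build(SortedEdgeList, order_a).edges == build(SortedEdgeList, order_b).edges)

# Adding and then removing an edge leaves no trace in the sorted structure.
with_deletion = build(SortedEdgeList, order_a + [(3, 4)])
with_deletion.remove_edge(4, 3)
print(with_deletion.edges == build(SortedEdgeList, order_a).edges)
```

The last comparison is the link to deletion: after the edge is removed, the stored representation is the one that would have resulted had the edge never been added.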
Okay, we're going to do a final five-minute break, and then we're going to come back for a discussion. If you haven't gotten your questions on Slido yet, please do so. I will be looking at them as we moderate this conversation, which I think is going to be super, super interesting. Thank you. That means we're back at 11:11.

... and you're right. To me, the lack of action on privacy is a little bit difficult ... genuinely concerned with dealing with some of the core challenges. In fact, I was talking with somebody during the break who noted that Quebec's act has of course moved forward with respect to privacy, and I think we're seeing a number of provinces basically say: listen, if the federal government can't get its act together, we're going to move ahead ourselves. We can chalk up both, I think, some of the problematic proposals that I highlighted in the talk and the inaction on privacy to, frankly, simple politics, and the perception within government that these can be tough issues to explain. So if you have a simple, what you think is a winning policy, you're going to run with it. The so-called winning policy that I think we saw with respect to C-11 and C-18, and initially online harms until they got that pushback, was that as long as you were on the opposite side of Google, Facebook, and some of the other tech companies, you were where you wanted to be in terms of public opinion. I'm not totally sure that that's right, but that was their perspective. Privacy is so much more difficult, and I think the messaging that the government came away with, in terms of the initial reaction to the bill that went nowhere, was that this is really hard.
I mean, we saw that with respect to the presentations about how hard getting some of these issues right can be. So when they took a look at an issue that left those looking for more privacy concerned, because they said, you know, we don't think this legislation goes far enough, and then they got a reaction from the business community that said, we think what you've proposed already goes too far, there was no obvious hero. There was no obvious villain in privacy. It was just hard policy work that really needs to happen, and the government basically said, why bother?

Yeah, you know, I think another thing that really strikes me around the privacy issue, and I think it's similar to the competition issue, is that it's actually not that hard to build a law-and-order policy. We see this with the online harms proposal, which is like: the government's job is just to deal with all the bad things on the internet, like internet bad guys. It's easy to be, you know, it's us versus the child pornographers. It's a lot harder to be like: it's us making a nuanced balance that is constitutionally respectful of people's competing liberties and safety concerns. That's a harder sell. But you know, part of it comes down to the business model, right? If you have a government that is simultaneously looking at this sector as an ATM, and that sector is financed by intrusive advertising that relies on massive and extraordinary data breaches that amount to almost incalculable human rights issues, maybe we should reel some of that back. I wonder, do you have a reaction to that?
So, my candid view would be that the government isn't nearly sophisticated enough in its thinking on policy to make the connection that you just made: the idea that if what you want out of the platforms is their cash, then trying to limit what they do from a potentially competitive abuse perspective, or limit the amount of data that they collect or how they use it, necessarily implicates the amount of cash that they can generate, which can then be used for your policy objectives. Which I think is right. Frankly, I don't think that they're within miles of being able to make that connection. I think, from the perspective of certain politicians, it's a tick box on a mandate letter that says I've got to deliver on some of these issues. In fact, I think you can go even further and make the argument that in some ways they know going in that, on some of these issues, they haven't even figured out all of these complexities. They've decided to punt many of those issues to the regulator, to the CRTC, or to a yet-to-be-forthcoming regulator. And yesterday we had the head of the CRTC saying it's going to take at least two years to sort out the C-11 stuff, and that's before there are judicial appeals and reviews and the like. We're talking about years and years, and by that point in time, if you're Pablo Rodriguez, it's somebody else's problem.

That is incredibly depressing. Okay.
I want to come back to you, Michael, because I think that there's a real interest in this audience in particular in thinking about how to get engaged around these issues. But I want to ask a question to Andrea. There were two things that seemed a little bit counterintuitive or surprising about your findings to me. One of them was this idea that stronger passwords were more present in places where there were, I would say, maybe stronger democratic institutions or greater liberty, higher freedom of expression metrics, and things like that. It seems counterintuitive to me, maybe because you'd think that people who live in places where there are weak democratic institutions would have to take more of a proactive or self-protective approach to these kinds of issues. Or maybe it's the inverse, and people are jaded and don't have a strong desire to protect themselves on those issues because they don't feel that they're going to be respected by their government. So I'm really wondering what your reflections are around what you think explains that. And then the other thing that I thought was interesting is this idea that a higher level of digital skills doesn't necessarily result in strong passwords. So I'm wondering what your reflections are around what explains these two conclusions. I think it's an interesting link, but do you have a hypothesis about why?

So of course, the results that I presented today are kind of a preliminary analysis. I can go much further in those analyses, and I will in the future. But for the second one, the digital skills, I think that yes, you can know how to use a computer.
But it does not necessarily mean that you use it in the right way, or that you are aware of all the, not the consequences, but the bad aspects of using technologies. So the hypothesis, or the conclusion that I can state for now, is that they are not synonyms: good use of a computer is not a synonym of knowing how to use it. That's how far I can go for now, but in the future I hope I can answer this better. And for the other one, in democracies, I mentioned it really quickly during the presentation, there are a lot of other variables that come into play. First, and most importantly, access to information. When you have access to information, you can be more educated, you have a higher probability of having seen how to protect yourself, and you can educate yourself and go further in that. So this is how I would explain the result of the study.

Okay, that's really interesting. One other thing that I was curious about when I was listening to your talk was whether you know of anything in the academic literature, or whether you have a gut feeling or hypothesis, around whether people who have stronger passwords are more likely to engage in other kinds of good security hygiene, or more likely to, for example, use encryption technology or anonymity software. Or do you think that there is something particular about passwords that maybe makes your findings not generalizable to those other kinds of user behaviors?

So my thinking on that would be that yes, it should be related, but I don't know, because I didn't study it directly. But all the variables that I put in the model in my presentation today, I didn't put them in randomly, right?
I read the literature, and it made sense to put them there because those variables were influencing other technological behaviors, for protection and for other behaviors. So it made sense, and I think that in the end it would be related. But I think it's much more complicated than that, and it might be related in complex ways. So maybe in the future I'll be able to answer that.

Cool. Okay, that's really interesting, and I think there are probably a lot of people in the room who have an intuitive thought one way or the other about the answer to that question, so I'm sure there are people who will pitch you some research.

Okay, so first I have to apologize: I was sure that there was going to be some blockchain in this pitch. As a lawyer, when people say "the right to be forgotten, how do we deal with this messy problem?", it always involves a ledger. People just love the ledger. And so I was surprised. I found your talk really interesting, and it became much clearer to me that the technical problem you're working to solve is this problem of deletion in particular. As a lawyer, I was so struck by your comments at the end around this interdisciplinarity and how big the problem surrounding this question is, because at least in law, the deletion looks like the easy problem, right? The hard problem is: what do you delete? Who decides? How do you balance competing interests? Do you keep logs? Are there mechanisms for redress? All of those questions surround this issue for me. One of my questions for you is: what made you interested in working on this technical problem? And what do you see as future directions for the interdisciplinary work that needs to be done?

Yeah, so unfortunately, "crypto" has become a synonym for cryptocurrency in the general public's mind. So that's where the confusion came from.
I guess. So, to answer your question: the tools I presented, the definitions, everything, cannot tell you what to delete or what to keep, right? This is a question for lawmakers to decide, or for system designers, to see what's useful and what's not. It just tells you what it means in technical terms, what you should expect when you ask for something to be deleted, in terms of purging everything. The definitions that I presented are very strong in the sense that they purge everything, from the data being stored through to the data being processed in machine learning. There's this "machine unlearning", where you delete the data from the model itself. What got me interested in this question was actually just stumbling upon the paper of Garg, Goldwasser, and Vasudevan and noticing that this definition is very strong. It's unlikely to be satisfied in practice. And if you want to have an impact using this definition, you need to have something that's applicable to everyday life, right? When you have web servers like the Googles, Facebooks, and Amazons of this world, they provide a service based on the data that they collect, and you should expect to have the right to erase your data. So it makes sense to define what it means to delete the data from their servers without having to restrict the services that they offer.

And one of the questions I had about your research, and I might have just missed this bit: what are the use cases that you imagine for this kind of approach? Because there are some questions in the chat about things like people's conversations. So are you imagining
the deletion rules that you set out here applying to information that is public in character, or to information where there might be some kind of joint privacy interest?

So that depends on each of the definitions that I presented. For the strong variant of deletion compliance, a chat or messaging app cannot be deletion-compliant, because you're sending out your information to another person who has a local copy of it. Unless you ask every individual to delete the data that they hold about everyone else, you're not going to have a satisfactory notion of deletion compliance. For the weaker definition that we propose, this message that you send to someone else is now part of the view of the environment that can be used to reconstruct the content of the hard drive of the server, right? The server can know this stuff because you shared it with someone else. Then there's also the question of whether you should be able to delete chat messages in a private conversation, but that's not something that cryptography has an answer to. It's maybe more of a political question.

Something I was thinking about when you were talking about the right to be forgotten, and it comes up in some of the Slido questions too, and it's something I was thinking about in Michael's comments about the link tax or this online harms problem, and actually it goes to the password issue too: part of the problem that all of you are dealing with, in very different ways, is the classical problem that the internet is a machine that copies bits. On every level, whether we're talking about password security, freedom of expression in the context of news, or the right to be forgotten, we're all running up against the problem of the sort of
fundamental nature of what the internet is. I'm wondering if any of you have a reaction to that, maybe starting with Michael, and your conversations with policymakers.

Yeah, I'm glad, Jesse. I think it's an interesting point, and you're right, it does flow throughout the various presentations. Your question actually brings to mind a comment that was made just last night in one hearing, where the head of the CRTC was asked about regulating the internet. His response was: I don't regulate the internet, or, the CRTC isn't in the business of regulating the internet; we regulate broadcast. And what you need to understand is that his view is that broadcast, within the context of Canadian law, is all audiovisual content, wherever it originates, on whatever network it happens to exist on. Now, where I come from, that's pretty much the internet. But if you're a broadcaster, you don't make that connection. You think that those bits are broadcast bits, and that there is a role to play for a broadcast regulator. Now, it may well be that on certain issues there is a role to play, but I don't think that the starting point is that all bits are broadcast. One of the problems that we face is that the starting point from the Canadian legislative perspective, and certainly at this point in time for the head of our regulator, is that this looks like broadcast. Whether it's the Twitch stream of this panel or a movie or a program on a mainstream commercial channel, from their perspective all of this is the same, and we can apply the same regulatory rules in that way. And I think your point about the need to have a more sophisticated understanding of what's taking place technically, and to have policies that respond to that, is something that has oftentimes been missing.

I don't know if you have thoughts about that, that part of the problems you're trying to solve are the problems of undoing what the internet does best.

So when it
comes to this broadcast versus, I guess, a private line going out of your house: a lot of the internet is broadcast as well. If you have a cell phone, you just broadcast the signal and the antenna catches it, and what makes it not really a broadcast is that you use encryption to have a secure line between yourself and the provider. So maybe there's a nuance in the definition, and maybe there's something to address there.

And I wanted to touch on the way the internet works right now. In the past 20 years there's really been a centralization of the internet, where you have these giants that form the backbone of the internet that we use every day. Back in the day, mostly everyone could host their own infrastructure. I could have my email server at home. I still can, but it's a lot of hassle. One of the big hypes recently, all of this Web 3.0, is kind of people wanting to go back to this original state of the internet where everything is decentralized and there's no big entity having huge power over the internet. But unfortunately, as this becomes more popular, you also see a tendency for things to centralize. Take NFTs: you have these big platforms that essentially host the image that you bought, so if they go offline, you're left with just a couple of bits on a public blockchain.

And I wanted to add something [unclear], which is that cryptography has a lot of tools to protect our data online. You have encryption, you have password authentication. There are a lot of tools to have a safe life online and have privacy respected. But the downside is that you have to have strong operational security: you have to have strong passwords, and keep them and not lose them.
So this is somewhere where, if you lose your password and you lose all your photos on Google Photos, then you're going to be mad. This is why Google just has this one authentication for all of its services, and it's not end-to-end encrypted, because if you lose your keys, then you're going to be distraught. So this is something that comes into the adoption of these technologies by people, and why secure passwords are a must.

That's really interesting. And Andrea, do you have thoughts on that issue, on that question of, you know, we could have really strong passwords, but if we're sharing them, for example, or if we're engaging in other kinds of behaviors that are insecure, it doesn't really matter. How do we measure that? How do we think about that in a holistic way?

Well, if I make myself a utopian, I'll say this. I introduced the subject by saying that it's impossible for a human to remember all those passwords, right? But the idea behind passwords is excellent. It's just that it comes with consequences. So of course we can find solutions, as you say, but in a utopian society we might also just rethink the whole thing and start from scratch. And we see that appearing a lot more, with alternatives to passwords like recognition, and two-factor authentication is an example, or things like that. So yes, we are starting to think differently, and I think this is how the future should be: technology should be thought through for a human to use it and for a computer to use it.

I really like that. There's one of the questions in the chat which is basically: what's your take on passwordless login? Is that the solution to the password problem? Are passwords dead? I think it's a really interesting question, because it kind of goes to Philip's comment also about the role of centralization and intermediaries, right?
It's one thing to say, well, people are really bad at remembering passwords, so we'll let our browsers do it, or we'll let the password manager do it, or we'll let our retinas be our passwords. But all of that comes with different trade-offs. Do you have a take on whether the password is dead? Should the password be dead? And if not that, then what?

Yes, I think it should be dead, just because of the way it was invented. But I think it was a great solution, it is a great solution, and of course there are things to help with that, like password managers, and we should all use them, right? But most people don't. My mother doesn't know that they exist, for example, and she writes every one of her passwords on a sheet. So I just think, in the future, there's no password. I'm positive.

Maybe I can add something to that. I think there are a lot of bad password policies going around, in enterprises and everywhere, when we know how to make good passwords that are easy to remember, right?
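As a rough illustration of the dice-and-word-list method raised at this point in the discussion, here is a minimal Python sketch. The 36-word list, the helper names `roll_die`, `dice_index` and `make_passphrase`, and the choice of two dice per word are assumptions made for the example; a real diceware list has 7,776 words selected with five dice, which gives about 12.9 bits of entropy per word.

```python
import secrets

# Toy word list (an assumption for this sketch): 36 words, so two dice pick one word.
WORDS = [
    "apple", "bridge", "candle", "desert", "eagle", "forest",
    "garden", "harbor", "island", "jungle", "kettle", "ladder",
    "marble", "needle", "orange", "pencil", "quartz", "rocket",
    "saddle", "timber", "umbrella", "valley", "walnut", "yellow",
    "zipper", "anchor", "basket", "copper", "dolphin", "engine",
    "falcon", "guitar", "hammer", "igloo", "jacket", "kernel",
]


def roll_die() -> int:
    """Simulate one fair six-sided die using a cryptographically secure RNG."""
    return secrets.randbelow(6) + 1


def dice_index(rolls: list[int]) -> int:
    """Map a sequence of die rolls (each 1-6) to a zero-based index, reading the rolls as base 6."""
    index = 0
    for roll in rolls:
        index = index * 6 + (roll - 1)
    return index


def make_passphrase(words: list[str], n_words: int = 6, dice_per_word: int = 2) -> str:
    """Pick n_words words, each selected by rolling dice_per_word dice."""
    if len(words) != 6 ** dice_per_word:
        raise ValueError("word list size must equal 6 ** dice_per_word")
    chosen = []
    for _ in range(n_words):
        rolls = [roll_die() for _ in range(dice_per_word)]
        chosen.append(words[dice_index(rolls)])
    return " ".join(chosen)


if __name__ == "__main__":
    print(make_passphrase(WORDS))
```

With physical dice, `roll_die` would simply be replaced by the numbers actually rolled; the point is that the words are chosen by chance, not by the person.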
You just take a long list of words, you throw some dice, and you select the words that are chosen by the dice. But of course all of these policies force us to add special characters and numbers and such, which makes for passwords that are hard to remember. There's also a bit of a hierarchy among passwords: if I have to use a really complex password to unlock my computer every 15 minutes when I walk away, then I'm going to be tempted to reuse the same one over and over again and just increment it. So you can have a really strong password for some uses, but for some other uses just a key code should suffice.

Yeah, I think that also reminds me of the point you made earlier about user behavior. I actually texted somebody about a surprise party they weren't supposed to know about, using Signal, and I feel like it's the perfect example of world-class encryption technology and zero opsec. So there's only so much solving of the human factor that we can do on the technical end.

We're supposed to wrap up in five or six minutes. I want to circle back to Michael. It's interesting, because there are actually questions about the right to be forgotten as a policy question or as a legal question, and I think that there's a really neat intersection between this government's approach to media policy and also this online harms stuff, which ultimately comes down to the theme that it's up to the state to set up institutions to remove certain kinds of content from the internet, whether that is information that is in breach of privacy laws, or information that might be harmful, or information that is, increasingly in the online harms context,
what we often call "awful but lawful": things that are offensive or perhaps insensitive, but not strictly speaking criminal or illegal content. I'm wondering, when you think about these issues, do you think first about the removal mechanism and framework, or do you think about the type of harm first, or the type of right being invoked? And how do you think policymakers and technologists should think about those issues?

Another really good question. I think the way the government and policymakers have been thinking about it is: they see a harm, and I mentioned the McKenna example during my talk, and they believe that it falls to them to fix it. Their view would be that the intermediaries, the platforms, in a pretty centralized world, as we've heard, either don't have the incentives to do it, or are inconsistent in how they're doing it, or just aren't doing it. And so their view would be: if they're not willing to step up, then it's incumbent on us as the government to take that position and do something about it. I think that there has been less analysis of, or even taking on board of, the costs associated with taking some of those policy approaches. Freedom of expression, of course, is an obvious example. Building in surveillance infrastructures in the name of solving these issues, or blocking infrastructures in the name of solving these issues: there's been little in the way of seeking to think through, well, what does that mean?
If, for example, you tell every major internet provider in the country, we want to ensure that you've got blocking capabilities, and we will set up a system whereby you must respond to our blocking orders immediately or in a short period of time, they could say, hey, that's going to solve this particular harm, without, I think, enough thinking about what new harms you're effectively creating in the process.

Yeah, I think that's really interesting. And maybe to wear my Citizen Lab hat for a second: I think the way we talk about these issues has become so much more politically sophisticated, and it's like the government hasn't caught up. It's sort of like, we need a grand unified theory of bad stuff on the internet and how to take it down. But the politics, the theory, the discourse, and the law have moved on from that, because we understand that, if we look at the evidence, when you build technical infrastructure to remove quote-unquote harmful content, in almost all cases the content that actually gets removed, or the people who end up censored and surveilled and monitored and policed as a result of these kinds of interventions, are overwhelmingly vulnerable and marginalized people. I'm wondering if you have a thought on that.

Yeah, I think you're absolutely right. I think the government was frankly shocked to see some of those very vulnerable groups come forward as part of that online harms consultation and make the exact point that you just made, to say: we know that you're out there with policies that are supposedly designed to protect us, but what you need to understand is that some of your proposals are actually going to put us at risk in new or additional ways, and that you haven't thought through all those policies. And I don't think that there is any doubt that there's been a failure to see what some of those
proposals are and amplify them into a global context and see what they might mean. So when Twitter comes around and says, you know what, some of these proposals look a lot like what we deal with in China, North Korea, and Iran, the immediate response of the Canadian government is: how dare you say that? We are quite clearly not China, North Korea, and Iran. And we're not, from a democratic perspective. But they don't seem to fully appreciate that when you start proposing blocking solutions without due process, you're setting yourself up as a model for those jurisdictions, and for those jurisdictions to say: hey, if this was good enough for Canada, what's the difference? And while we may have some safeguards, those countries don't. It's just not the company we want to be keeping in any way, shape, or form.

Yeah, that's really helpful. Okay, with our last couple of minutes: this is an exercise I like to do with pretty much anyone who thinks a lot about the internet, and all three of you do. I want to bring you back to your earliest memories of using the internet. All of you are maybe at a slightly different generational place there, and I think that's great. What was magical about that internet to you? What was the possibility? And this is maybe something that people in the audience are thinking about too. If you could make the internet we have today more like the thing that captured you in those first memories, if you could wave a magic wand, what would you change, whether it's technical or political or legal? The internet you've always wanted. We'll start with Michael, and we'll go in order.

Thanks for that. We're going to start from the early days of the internet. So, I'll tell you, I have two memories, but the first one is where I was actually using it.
I was a law professor at Dalhousie in the mid-1990s. I was pretty young at the time, and, I mean, it's just so geeky, but I was writing a piece on some government legislation back in Ontario, and I remember being able to go on Netscape Navigator and access one of these documents, and I thought this was just unbelievable. I was accustomed to going down to the library and waiting for it, and maybe it shows up and maybe it doesn't, or interlibrary loans, for those who know what that stuff is. And the fact was that I just had to press a button and that document was there, which, obviously, by today's standards is completely unremarkable. But from the world that I grew up in, the idea that we had instant access to information, and in this case instant access to information from governments in an open and transparent way, was truly game-changing. I changed a lot of the work and research that I was doing at that time to focus much more on the internet, because you could see how this would transform things. I think our governments now spend too much time trying to roll back from that: not putting that information online even though they can, not taking advantage of this technology in this way, seeing what I saw as just this miracle of access to information as somehow now a threat. And I wish that our governments, not just in Canada but elsewhere, would return to more of that embrace of what more open, instant access to information means for society, no matter what your area of interest might be.

And I think you're next.

Yeah, so I have a baby face, but my earliest memory of the internet is still with the telephone cable.
So yes, I was born. My earliest memory is that I had to use the internet when nothing was happening in the house, because, you know, my mother might have to make a call. So it was always a calm moment when I was on the internet, and I was searching for definitions of things, you know, and no idea why I became a researcher, right? But it was like, definitions are in the dictionary, but these were definitions of ideas, and I loved to go search for things, and it was in such a calm moment. Today there's this idea that the internet is a dangerous place, right? So I just wish that we could feel this way again in the future. So I'll continue my research on offenders and the ways to protect users more effectively.

That's bad. Yeah, the internet is everywhere, all the time, fast and stressful. A calmer, gentler internet: I'm ready for it. Philip, you get the last word on this one, and then we'll wrap.

So it's far from my first experience with the internet, but it's my first experience with the complexity of the web. I was in CEGEP, and we had a professor who had a Gmail account, and he couldn't access Gmail from the university's [unclear] campus where he teaches. So for our end-of-semester project, he asked us to build a kind of proxy, right? It was in PHP, and we would load up Gmail or Google or something, load all of the elements, and then change all of the hrefs to point to our own address, so that we would just redraw the page so that it could be accessed from a different address. And of course, back then my impression of the internet was that everything was static, right? You have HTML code, or you have PHP that renders something, and then you receive it. But of course nowadays everything is dynamic.
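The link-rewriting step of the proxy project described above might look roughly like the following. This is a Python sketch rather than the PHP the students used, and the function name `rewrite_links`, the proxy address `https://proxy.example/fetch?url=`, and the regular-expression approach are all assumptions for illustration. A real proxy would use a proper HTML parser and would also rewrite `src` attributes and form targets.

```python
import re
from urllib.parse import quote, urljoin

# Matches double-quoted href attributes only; deliberately simplistic for the sketch.
HREF_RE = re.compile(r'href="([^"]*)"')


def rewrite_links(html: str, page_url: str, proxy_base: str) -> str:
    """Point every href in `html` at the proxy instead of the original site.

    Each link is first resolved against `page_url` (so relative links work),
    then URL-encoded and appended to `proxy_base`.
    """
    def replace(match: re.Match) -> str:
        absolute = urljoin(page_url, match.group(1))
        return 'href="' + proxy_base + quote(absolute, safe="") + '"'

    return HREF_RE.sub(replace, html)


if __name__ == "__main__":
    page = '<a href="/mail">Mail</a>'
    print(rewrite_links(page, "https://example.com/home",
                        "https://proxy.example/fetch?url="))
```

This only works when the links are present in the HTML the server sends, which is the static-page assumption the speaker goes on to contrast with JavaScript-generated content.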
You have JavaScript generating all the content on your page, and that was my first contact with this new internet, beginning with Gmail and all of these other products.

Cool, that's great. I think we're going to wrap in the interest of time, and we're only five minutes behind, which is extraordinary. Good job, everyone. I want to thank all three of our presenters today. I learned so much from each of you. The questions were great; I tried to synthesize them in the most cogent ways I could. There's lots more on the Slido if you want to go check them out. Thank you, everyone. Have a wonderful lunch, and come back for more great conference later.

Yeah, big round of applause for our speakers.

All right, good afternoon. Let's just wait for the slides to come up. Cool. So, super happy to be here. I endorse everyone who was previously on stage saying how great it is to see people face to face. So before we start, full disclosure: I've never seen a real air-gapped network in my life. But in my mind, it looks really just like in this picture: a small castle, isolated, cut off from the internet, and used to protect the most sensitive stuff. Top-secret documents, power grids, maybe nuclear centrifuges. And whenever we analyze malware that is designed to attack such a network, I'm not going to lie:
there's a little bit of an adrenaline rush, because we know we're looking at a tool that the threat actor designed to attack something of great value, and something that probably went unnoticed for too long. So, I'm Alexis, and with Facundo we'll be talking about how threat actors have been attacking air-gapped networks with malware specifically built to operate in these very restricted environments. You'd think such malware would be pretty rare, right? Which is kind of true. But in 2020 alone, four previously unknown frameworks were uncovered, and that's what prompted us to revisit that specific class of malware, put all the known frameworks in perspective, and see how they work and whether we could come up with effective methods to detect these frameworks and prevent them from succeeding. We actually published a very thorough white paper on our corporate blog, WeLiveSecurity.com, just a few months ago, so all the details are there, and today we're going to present some of the highlights of that research. To do that study, we had to come up with a definition of what constitutes malware built to target air-gapped networks, because there are no real definitions out there, at least not from the technical point of view. After a couple of weeks of back and forth, Facundo and I agreed on this specific definition: we define it as malware, or a set of malware components acting together, so a framework, that implements an offline covert communication mechanism between an air-gapped system and the attacker. We believe it all started a little over 15 years ago with the infamous group called Sednit, also known as APT28, who we believe developed and used USBStealer as early as 2005. After that followed no fewer than 16 other frameworks developed by other threat actors.
So, a grand total of 17. A few of those 17 have been attributed with pretty high confidence to known threat actors such as DarkHotel or Mustang Panda. For the others, the attribution has been less clear-cut or even pretty controversial. But regardless, we can state that all of them are the product of nation-state actors, hence the title of our research: 15 years of nation-state effort.

In our analysis, we studied all the existing public reports on those known frameworks and compared them on several properties, with a focus on the ones that are specifically relevant to air-gapped network environments. For example: how does the malware get executed on the air-gapped side, and how does the malware establish a communication channel between the isolated systems and the attacker? In other words, how does the malware jump the air gap, per se?

For this we formalized the anatomy of air-gapped networks from the malware operation perspective, and we came up with two distinct categories: connected and offline frameworks. Let me show you how that works.

Most of the frameworks belong to that first category, the connected ones. Those are built to provide fully remote, end-to-end connectivity over the internet between the attacker and the isolated systems. We'll consider a target network as having two sides separated by an air gap. At the top you've got the connected side: computer systems that have internet connectivity. At the bottom you've got the air-gapped side, where none of the systems can be reached from the internet.
That's where the attacker really wants to get to. And that's a fairly typical setup, at least so I've heard, because people working in these kinds of environments still need a connected system to get their emails, browse the net and that kind of stuff. That connected system will naturally be the point of entry for the attacker to get inside the network.

Now, the techniques used to gain access to that initial connected system don't really differ from traditional attacks. It can be email-based, or watering hole attacks. That's not really the interesting part. What's interesting is the type of payload that will be deployed on that system. One thing that payload will do that is specific to air-gapped environments is wait for a USB drive to be plugged into the system and weaponize it, just like a USB worm, actually. That means two things. First, it will copy the malware meant to be executed on the air-gapped side. Second, there will be some sort of execution vector that will trigger the execution of the malware. It could be an exploit, decoy documents or something else, and Facundo will get into more detail about that specific part in just a few moments.

Then, when the weaponized drive gets inserted into the air-gapped system, that's when the execution vector is triggered and the malware is deployed. That malware will usually do some automated stuff, like reconnaissance: collecting information about the host environment and the network environment. It will collect files that the attacker wants a copy of, and it will store all that data on the USB drive in a very covert way. That's where the data exfiltration from the air-gapped system happens. Again, Facundo will give you some pretty cool details about how that data gets copied onto the drive, and the various techniques used so that it doesn't get detected at all.

But now, the data leaving the air-gapped system onto the USB drive is one part, but
the data still needs to reach the attacker, right? For that, the drive needs to reach the first infected system again, the connected system. The malware running there, on top of weaponizing USB drives, will also have code to recognize a drive that contains exfiltrated data. It will parse it and exfiltrate the stolen data back to the internet. All these steps usually happen automatically in most of the 17 frameworks we analyzed.

Other frameworks have one additional layer of functionality: they implement a totally independent protocol to allow the attacker to interactively exchange commands and responses with the air-gapped systems. In these cases we'll see two different protocols. One protocol goes over the internet, between the attacker and the connected system, and a totally different protocol goes over the USB drive to communicate between the connected system and the air-gapped ones. You could see the connected system as acting as a proxy between the attacker and the real systems of value here.

In other, rarer cases, the attack scenario doesn't involve any connected system at all. We call these offline frameworks, but I think of them as Mission Impossible frameworks, because in these cases everything indicates the presence of an operator on the ground who will perform those critical actions, such as weaponizing the USB drive, or even physically carrying the drive, plugging it into the target systems and leaving with the stolen data.

And now I'll pass the mic to Facundo, who'll give you some pretty cool details on the various TTPs that we observed.

Thank you. So, as mentioned, we focused on the malware properties that are specific to attacking air-gapped networks. We have divided into three broad categories all the techniques used to execute the malicious code for the purpose of gaining a foothold in the network or conducting reconnaissance of
potential air-gapped systems. These categories are: automated execution; non-automated execution, unknowingly triggered; and non-automated execution, deliberately performed.

So let's begin with automated execution. Exploiting remote code execution vulnerabilities is the most effective technique to execute the malware. Eleven such vulnerabilities have been discovered and patched in the last decade, and only two have been confirmed to have been used in the wild. The most famous one is, without a doubt, the Stuxnet LNK exploit, which only requires the user to view a set of LNK files through Windows Explorer to trigger the vulnerability. However, it was later discovered by Kaspersky researchers that the Equation Group's Fanny malware had used the exploit even before Stuxnet, since at least 2008. And even after Microsoft released a patch in 2010, the Flame, miniFlame and Gauss malware continued to exploit it. But since the discovery of these malware families, no other exploit-based automated execution has been observed in the wild to compromise air-gapped networks.

For the next category, we will take a step back from the complexity of exploits and software vulnerabilities and focus instead on the human factor and deception tricks. In this scenario, the aim is to trick an unsuspecting user into executing the malicious code. We have observed three main techniques: the abuse of the Windows AutoRun and AutoPlay features, decoy files to lure the potential victims, and rigging existing files with malicious code. For example, DarkHotel's Retro malware uses a tool that allows it to replace Word documents with RTF copies that contain an exploit that will launch the malware on the machine.

Now, at least five of the 17 frameworks have abused AutoRun or AutoPlay in one way or another: USBStealer and Agent.BTZ, as well as an earlier version of Stuxnet, which implemented an autorun file that contained both the executable and the AutoRun instructions. It disabled AutoPlay to force the user to go
to My Computer, or use the entry in the navigation pane of Windows Explorer. And with a shell command entry, it added an additional Open command to the context menu that executed Stuxnet if the potential victim clicked on it or double-clicked on the drive shortcut.

Now, Mustang Panda's custom PlugX malware uses a much simpler trick. It hides all the existing folders on the drive and creates an LNK file for each one, pointing to the malicious executable in the RECYCLE.BIN folder. This technique preserves the appearance of a clean drive.

Just one second, please.

Perhaps the techniques under this last category are the most puzzling. The analysis indicates that the attackers did not intend to trick an unsuspecting user into executing the malicious code. It appears that the concept for the mission was to have a human asset covertly execute the malicious components in the target network. Now, what do you think, from a malware researcher's perspective?
How can we identify such a scenario? Let's take the interesting case of USBCulprit by the APT group Cycldek, also known as Goblin Panda. In this case, the code running on the connected side, responsible for weaponizing the designated USB drives, copies the malware meant for the air-gapped system into a hidden folder on the drive without any execution vector. So the analysis indicated that the only possible way for the malware to execute is if someone knows exactly where to look for the malware and launches it manually.

Now, in 2015 we discovered a malware on a mission. Nice. We call it USBThief. At the time we could not attribute this sophisticated malware to any known group. It wasn't until two years later, when the Vault 7 leaks occurred, that we began to think that the malware was part of the Lamberts APT. Now, a new finding helped us narrow down the candidates to an implant codenamed Margarita. The description of the system fits perfectly the scenario and the capabilities implemented by USBThief. The human asset (let's call him Tom, to continue the Mission Impossible theme) will weaponize a USB drive and create the circumstances on the target machine in which he will have to use certain files on the thumb drive.
He will launch Notepad++, Firefox or TrueCrypt, and the software, once launched, will in turn silently load the malware, which in the background will prepare all the collected data for exfiltration.

Now, finally, on that note: getting the malware executed on the target is one part of the mission. The collected information needs a way to leave the air-gapped system and safely reach the attackers. We will now present what we consider some of the coolest ways the attackers have managed to achieve this goal.

So, going back to 2008, Fanny set the bar too high even for some of the most sophisticated malware that was discovered later but possibly developed around the same time, and by groups with the same technical proficiency. Fanny is what our colleagues from Kaspersky dubbed a USB backdoor. One of Fanny's most interesting features is the capability to create a hidden storage space on USB drives that use the FAT file system. They achieve this by creating a directory entry with a combination of attributes that makes it invalid for the Windows parser. When Windows finds such a case, the entry is ignored, essentially making the space invisible. This entry contains an offset used by Fanny to locate an allocated space of almost one megabyte in size, which contains the collected information, as well as both commands and the results of executing those commands on the target machines, or the modules that the attacker wants to execute to further augment their capabilities on the compromised system. It's also worth noting that Flame used a similar trick by creating an entry with an invalid name that Windows will also ignore. This invalid name was for a special file.

Now, Ramsay is a malware that we discovered in 2020 and attributed to the DarkHotel APT. The attackers came up with a decentralized way to spread the collected information across the system drives, as well as network and other removable drives. When Ramsay is injected into a process,
it will hook the CloseHandle API, and when the hook is executed, it checks the extension of the file that was opened by the process. If it is a Word document, it will append a special container that encapsulates the collected, compressed information. The same container is also appended to every Word document found on any available drive. Ramsay follows the same philosophy to receive commands. It will look for other types of files which might have an appended container with the instruction to execute certain modules or commands on a specific target machine, based on a GUID that is in the container.

All right, now, how to defend against those types of attacks. If you can remember just one thing from our talk, it is that it's always, always, always about USB drives. There have been no publicly reported cases of any other physical layer used to communicate across air gaps. No electromagnetic signals. No acoustic signals. Nothing esoteric like that. It's always via USB drives. So how to make it harder for attackers?
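A brief aside on the appended-container trick described above. The following is a minimal sketch of one way such an append could be noticed, not the detection the speakers used. It assumes the document is a .docx file, which is a ZIP archive, so a well-formed file normally ends right after the ZIP end-of-central-directory record. Legacy .doc files would need a different check, and the function names here are hypothetical.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"  # ZIP end-of-central-directory signature
EOCD_MIN = 22             # size of the fixed part of that record


def trailing_bytes_after_zip(data: bytes) -> int:
    """Return how many bytes follow the ZIP end record, or -1 if none is found."""
    pos = data.rfind(EOCD_SIG)
    while pos != -1:
        if pos + EOCD_MIN <= len(data):
            cd_size, cd_offset = struct.unpack_from("<II", data, pos + 12)
            (comment_len,) = struct.unpack_from("<H", data, pos + 20)
            end = pos + EOCD_MIN + comment_len
            # Accept only a record whose central directory ends where it starts,
            # so a stray signature inside appended data is not mistaken for it.
            if cd_offset + cd_size == pos and end <= len(data):
                return len(data) - end
        pos = data.rfind(EOCD_SIG, 0, pos)
    return -1


def make_demo_docx() -> bytes:
    """Build a tiny ZIP that stands in for a .docx (illustration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", "<w:document/>")
    return buf.getvalue()
```

A non-zero result only says that something was appended after the archive; it does not identify Ramsay, and archives with prepended data or ZIP64 records are outside what this sketch handles.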
Well, of course, disable USB ports on any systems where they're not absolutely necessary. That's going to greatly reduce the attack surface. For the remaining systems, where USB drives have to stay enabled, there's a way to implement policies in Windows to prevent file execution from removable drives. So that's one thing.

There are some more complex scenarios where you could deploy some sort of middle box, where operators, the legitimate users of these networks, would connect USB drives whenever they cross the air gap in either direction. That machine would, for example, remove unwanted file types such as LNK and autorun files, and it could perform an anti-malware scan as well. Of course, we don't really expect an attacker on the ground to follow that policy, but you can still put some controls in place. Someone shared with me a technique that apparently is deployed in different organizations, where that middle box would also take a forensic image of the drive. Combined with some proper logging on the other systems, it would allow a sysadmin, for example, to spot a USB drive that was inserted into a system without having been sanitized first, and they could investigate further. So there are ways to at least detect some anomalies.

Now, keeping air-gapped systems updated is also something that could be interesting. Here we see the use of zero-day exploits against air-gapped systems by different frameworks. Stuxnet used an impressive five zero-days, Fanny and Brutal Kangaroo two each; then there is EZCheese.
We're not sure if that one was a zero-day. But in fact, one-days were actually more popular. That means that air-gapped networks got breached by exploitation of vulnerabilities for which patches were available. The thing is, apparently some sysadmins think that keeping a network air-gapped will protect it against attacks. But if the systems are unpatched, you've got some sort of egg model, where you've got a very strong outer shell, which is the air gap itself, but as soon as it breaks, you end up with a big mess. So it's really not ideal.

Just a few words on the challenges of analyzing that type of malware specifically. It's challenging not only because the malware is usually pretty sophisticated and technically advanced, but also because samples are hard to come by. Air-gapped systems don't always run endpoint protection. And even if they do, they probably don't have telemetry enabled, where incidents of suspicious files would be reported to the vendor, and that creates a huge blind spot for security vendors like us. At the same time, these attacks happen, kind of by definition, in very sensitive networks. So victims are very unlikely to share samples with external researchers, and even less likely to produce a public report of the incident and describe what happened.

An example of that is Ramsay. As Facundo mentioned, that's a malware we discovered in 2020. The research started by spotting a trojanized 7-Zip installer on VirusTotal. We analyzed the file and determined that it was a component meant to run on the air-gapped side of a network, and we started looking for the other component: the one that would be running on the connected side, the one that would parse the USB drives and look for that Ramsay container that Facundo mentioned before. As you can see, that's an actual screenshot of our internal wiki. For two years that element has not been
filled in. We never found the corresponding sample that would parse the actual container. So who knows, maybe we'll find that sample one day and we'll really understand how the attackers used Ramsay in their attacks. But until now, well, we're not sure how everything really worked.

So that's it for us. Thank you for your attention. And if you ever come across a malware that you believe might be built to target air-gapped networks, feel free to reach out. We assure you that we will handle the samples with the utmost confidentiality. We'll honor any TLP designation you want to assign. Thank you very much.

All right. Thank you so much for the introduction. I hope everyone can hear me. So let's talk about APTs today. We're going to specifically talk about MuddyWater, which is an Iranian APT. I'm just going to take a moment to share my screen so that you guys can see my presentation. All right. Hopefully everyone can see my presentation now.

Good afternoon, everyone. My name is Asheer Malhotra. Today I'm going to be presenting about the MuddyWater APT. My talk is titled "MuddyWater: From Canaries to Turkeys". Before we begin, for those of you that don't know me, I'm Asheer. I'm a threat researcher at Cisco Talos. I specialize in malware analysis, threat intelligence and different kinds of malware detection techniques. More recently, my focus has been on disclosing APT operations, specifically in the Asian continent. Right now I'm located in the United States; I'm presenting to you from Washington, DC.

Now, this research was done in collaboration with Vitor. Vitor is also a cyber security researcher at Cisco Talos. He's actually the research lead for my team for Europe and Asia. Vitor was originally supposed to present this research, but unfortunately he couldn't present today, which is why you all are stuck with me today.
So yeah, Vitor loves mobile malware and loves to reverse engineer different kinds of malware samples. He's an avid APT hunter, and Vitor is located in Portugal right now.

All right, let's talk about the agenda. Today we're going to talk about four key sections. We're going to introduce the MuddyWater APT group. We're going to take a look at about five campaigns that have been conducted by MuddyWater in the past year or so. Then we're going to talk about a very novel technique that this APT has started using recently, namely infection tracking. They basically use a specific technique to track successful infections across the set of victims, and we're specifically going to talk about homemade tokens and canary tokens. Then I'm going to take a couple of slides and go through the conclusions. Hopefully, by the end of the presentation, you will know as much about MuddyWater as I do. So, fingers crossed.

All right, let's talk about MuddyWater. So what is MuddyWater? MuddyWater is an Iranian APT group. It's also known as MERCURY or Static Kitten. Very recently, I think at the beginning of this year, it was attributed to Iran's Ministry of Intelligence and Security, the MOIS, by the United States Cyber Command. MuddyWater primarily tends to target entities, usually government entities, in North America, Europe and Asia. The focus of their operations is primarily espionage and intellectual property theft, with the intention to establish and maintain long-term access to their targets' networks. We've seen some sporadic instances of MuddyWater carrying out ransomware attacks as well, but that's a whole different topic of discussion and we're not going to be covering that today.

Now, we believe that MuddyWater is a supergroup.
This is basically an umbrella organization, or a conglomerate of groups, that consists of smaller groups that focus on individual geographies all throughout the world. That being said, let's take a look at all the different campaigns that have been conducted by MuddyWater over the past year or so.

Now, the intention of this presentation initially was to talk about three key campaigns and show similarities between them. The first campaign that we wanted to talk about was one that started in April and was carried out through August of 2021. This campaign specifically targeted entities in Armenia and Pakistan. The second campaign that we saw MuddyWater conduct was against Turkish entities, and we discovered this in November of 2021. And then there was a third campaign, targeting a lot of countries in the Arabian Peninsula, way more prolific and way more aggressive, that we discovered in December of 2021.

During the course of our research, while we were tracking all three of these campaigns, we discovered that there were multiple overlaps in the TTPs used across all of them. Basically, a technique would be introduced in one campaign. It would be refined and made reliable, and then it would be migrated to another campaign that was being conducted in a completely different region of the world. It's not just the reuse of techniques that we observed across these three campaigns. We also saw a lot of new techniques being introduced, and I'm going to talk about them during the course of this presentation as well.

Now, when we started looking at these campaigns, the more we delved into them, the more research we did and the more stuff we uncovered, the more we realized that in order to present this at, you know, a really good conference like NorthSec, we needed to go back and take a look at some of the other campaigns that MuddyWater had conducted in recent years as well. So I'm going to talk through a timeline of all the different
campaigns and the different attack instances in this particular template. We're going to list all of them, and then we're going to list all the salient TTPs that were used in each of these campaigns.

So let's start with the first one. The first campaign that I'd like to talk about was conducted by the APT group in March 2021, and it targeted countries in the Middle East. This specific campaign consisted of phishing emails with lures that were sent to targets. Basically, a PDF would arrive in your inbox that would say, hey, open up this PDF, this is from a legitimate entity. The PDF would contain language that said, hey, you need to download this specific zip file, or this archive, from this remote location and execute the file inside of it. What basically happened was that the malicious archives consisted of remote control software utilities such as ScreenConnect and Remote Utilities. Once the victim executed these utilities on their endpoint, the attackers were able to manually connect to the infected endpoint, and then they would start pivoting and start an entirely new infection chain from there. During the course of this campaign, we also saw the attackers use various types of commodity tools such as Ligolo, which is reverse tunneling software that can be used to establish long-term communication channels between an infected endpoint and the attackers.

The screenshot that I have on the screen today is a PDF that masquerades as a circular from the National Media Council of the United Arab Emirates. The text here basically states: we have a new version of the media library. It's available on this link.
Please go ahead and click it. Don't ask any questions, open up the zip archive and execute the file inside of it. And that's how the victims get infected with remote control software that the attackers can then connect to.

Now, the next attack instance of an operation conducted by MuddyWater was against Pakistan in April of 2021. This attack instance consisted of maldocs being delivered to the victims, usually masquerading as a government document of some sort. The screenshot that I have on the screen is from a blurred-out court case from a Pakistani court. This maldoc consisted of malicious VBA macros that would reach out to a remote URL and then download the ConnectWise remote access client. That would be run on the system, and then the attackers could connect to it.

What's interesting about this campaign, and the reason why we wanted to talk about it, is that this attack instance was the first instance of tracking tokens being used by the adversary. For those of us that don't know what tracking tokens are, they're basically URLs that are embedded inside an artifact, you know, an HTML file or an executable or a maldoc. When that specific artifact is opened on the endpoint, the artifact will make an HTTP call to that specific URL in order to register a successful infection of the endpoint with the attackers. The reason why we call this an instance of homemade tracking tokens is that the attackers used their own servers, assigned their own IPs, and managed and operated these servers, which is why we're calling them homemade tokens.
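As an illustration of the tracking-token idea described above, here is a minimal defensive sketch, not the tooling used in this research. It assumes you already have the text of a macro or script extracted from an artifact, and it flags embedded URLs whose host is a bare IP address, which is one trait a homemade token server could show. The function name and the sample addresses (documentation-range IPs) are hypothetical.

```python
import ipaddress
import re
from typing import List
from urllib.parse import urlsplit

# Loose URL matcher: stops at whitespace, quotes and a few closing characters.
URL_RE = re.compile(r"https?://[^\s\"'<>)]+", re.IGNORECASE)


def raw_ip_urls(text: str) -> List[str]:
    """Return the URLs in `text` whose host part is an IP address literal."""
    hits = []
    for url in URL_RE.findall(text):
        try:
            host = urlsplit(url).hostname
            if host is None:
                continue
            ipaddress.ip_address(host)  # raises ValueError for hostnames
        except ValueError:
            continue
        hits.append(url)
    return hits


# Hypothetical macro fragment, for illustration only.
SAMPLE = 'req.Open "GET", "http://203.0.113.7/t?id=1", False \' see https://example.com/doc'
```

Calling `raw_ip_urls(SAMPLE)` keeps the first URL and drops the second. A bare-IP URL is only a weak signal on its own, so a result like this is a lead for an analyst rather than a verdict.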
This is basically a homegrown implementation of infection tracking tokens.

The next campaign that we observed being carried out by MuddyWater was against Armenia in June 2021. Basically, in this case the attackers would distribute malicious executables that were built with a builder that they have. We believe this is a customized builder, and we've seen executables generated by this builder being used in other campaigns as well. At the very core of it, the executable would drop a decoy document. It would also drop and execute a PowerShell-based downloader. We also saw the attackers use specific file-extension conventions for the PowerShell scripts that the executables would deploy on the infected endpoint, and we also saw the use of LOLBins to instrument components of the infection chain.

Now, the screenshot that I have on the screen, by the way, is supposed to be a confidential document from Ericsson. It's a technical guide of some kind, and it pertains to Viva-MTS, which is a telecom services provider in Armenia. So, you know, the attackers know what they're doing. This is basically used to target telecommunication entities in Armenia.

If you take a look at the PowerShell-based downloader, this was a very short, sweet and simple downloader, or stager, if I may say. I've got the code screenshotted on the slide deck here, but the downloader will basically send out preliminary system information to the command and control server, and then it will wait for the command and control server to issue PowerShell commands to the script. Any commands that are received by the script will then be executed on the infected endpoint. So, you know, a very short, very sweet, very simple, very tight implementation. Nothing fancy here.
You know, it gets them a foothold inside the network and allows them to start manually issuing and executing more commands on the infected system.

Now, an extension of this attack on Armenian entities was also seen targeting entities in Pakistan in August 2021. Basically, the infection chain is the same and the payloads are the same. You know, the same kind of TTPs have been used in this attack against entities in Pakistan. We see the use of the same type of executables that use the exact same type of PowerShell downloader. They use the same file extensions. They use the same type of LOLBins. However, in this specific attack instance, we saw the attackers use homemade tokens again. And what's interesting here is that the homemade tokens used in this case had the exact same IP address as those seen earlier in a very different campaign targeting Pakistan, the one from April 2021.

So, at this point in time, I'm basically trying to color code all of the salient TTPs in the slide to show you the similarities and the commonalities between the various campaigns. You see some common TTPs in orange, some in blue, and some in purple across different campaigns. This is basically meant for you to keep track of all the commonalities and all the overlaps in TTPs.

Now, let's look at one of the very interesting MuddyWater campaigns, from November 2021. This campaign specifically targeted entities in Turkey.
We discovered this in November, but this campaign had been operational since at least September of 2021. In this campaign, the attackers took a twofold approach. On one hand, they used executables that acted as the droppers and downloaders and the initial infection vectors for the infection chains. On the other hand, the attackers also used different types of malicious documents to instrument the attacks.

Now, in the case of the executables, we saw the executables deploy a second variant of a PowerShell-based downloader, or stager, and we also saw the use of different kinds of LOLBins. At a very high level, this is what the infection chain basically looks like. The executable will drop and display a decoy document that is relevant and pertinent to the victims that they're trying to infect. It will execute a PowerShell-based instrumentor script, which is responsible for executing the PowerShell-based downloader. The downloader will then reach out to the command and control server, which will issue new commands to the downloader, which will then execute those commands on the infected endpoint.

Now, as in the case of the previous downloader, we see that this new variant is also very short and very simple. Basically, the core of the functionality in this downloader is to take commands from the command and control server and just execute them non-stop on the infected endpoint. That's it. So basically they're using very small and very compact implants to establish an initial foothold in the networks.

Now, I spoke about the twofold approach, right?
So the second approach that the attackers took during the course of this campaign was to use a malicious PDF that would be opened by the victim. The PDF would have language that said: we cannot display the content of the document to you. However, the correct version of the document is available at this location. Please click on this location and open up this document. And that's exactly what happens during the infection chain. The PDF reaches out to a remote location that downloads a maldoc. The maldoc consists of malicious VBA macros that will in turn drop a VBS-based instrumentor and a PowerShell-based downloader. And then the infection chain is pretty much the same. You know, the instrumentor will execute the PowerShell-based downloader, which will download commands from the remote locations and execute them on the infected endpoint.

Now, what's interesting here is that the maldocs disguise themselves as reports or forms belonging to different ministries in the Turkish government. And I've got a few examples here.
You know, we saw maldocs that masqueraded as reports from the health ministry or the interior ministry of Turkey. So it's kind of interesting: this gives us an indication that they were actually trying to infect users who have relations with these specific ministries.

Now, another interesting point to note here is that the attackers started using canary tokens instead of homegrown tokens in their maldocs. For those of us that don't know canary tokens, canarytokens.com is basically a free service that allows you to register a URL that you can then put in your artifacts. When those artifacts are run, the artifact, the executable or the maldoc, will reach out to that specific URL on canarytokens.com and register a successful infection. So the attackers have now switched from the use of homegrown infrastructure for infection tracking tokens. You remember I spoke about homemade tokens. They moved from homemade tokens to canary tokens, and we believe that this is an attempt at legitimizing their infection tracking.

With MuddyWater, you know, most of these campaigns use PowerShell-based downloaders in one form or another to establish an initial foothold on the infected endpoints. However, there was a new campaign in December 2021, which made its way well into 2022, that targeted a multitude of countries in the Arabian Peninsula. What basically happened during the course of this campaign was this; this is the infection chain. A phishing email would arrive in the victim's inbox with the maldoc. The malicious document would be opened. It consisted of a malicious VBA macro that would instrument the next stages of the infection chain. However, what's different and unique here is that the attackers now moved from using PowerShell-based scripts to using WSF-based scripts. WSF is the Windows Script File.
It has the ability to have multiple scripts, if I may say, or multiple scriptlets, inside of the file, written in different languages. And that's what we saw during the course of this campaign. We saw the WSF-based RAT, which we're calling SloughRAT, being used to deploy additional malicious payloads on the infected endpoint. And we saw the Ligolo reverse tunneling utility, which we saw earlier in March of 2021, being used again in this specific campaign as well. So let me give you a brief overview of what SloughRAT is. We named it SloughRAT, but it is also known as Canopy. This RAT was also disclosed by the Cybersecurity and Infrastructure Security Agency of the United States, CISA. This is a WSF-based RAT, so it contains different snippets of code consisting of VBScript and JavaScript, and the execution jumps from one snippet to the other in order to carry out the malicious functionality that resides inside of this RAT. If you take a look at this specific RAT, if you just open it up in a text editor, you will see that there's a lot of obfuscation here. But if you peel through the obfuscation layers, at the end of the day it's a very simple RAT. It has the ability to execute arbitrary commands on the infected endpoints: the server will issue a command, and the RAT will execute that command on the infected endpoint. 
It then stores the output of the command in a text file; that text file is subsequently read by the RAT and then exfiltrated out to the command and control server. The RAT also had the capability to deploy additional malware payloads, and as I said, we saw the Ligolo reverse tunneling utility being used here as well by infected endpoints that were running SloughRAT. Okay, so at this point in time we've been through all the different MuddyWater campaigns. But what I'd like to highlight here is that if you look very closely at these slides, you will see that there are some very common TTPs: the use of executables across different campaigns, from Armenia to Pakistan to Turkey, and the use of remote control utilities from March 2021 that were reused again in December 2021 and well into January at the beginning of this year. All right, so let's talk about canary tokens. The usage of canary tokens for tracking infection chains is very novel to this specific APT. We haven't seen this before. The first time we discovered this was when we were taking a look at the campaigns sometime in April 2021. And the more we looked at these campaigns, the more we realized that the attackers were slowly refining this technique. So in April of last year, in the campaign targeting Pakistani entities, we saw that the attackers started using homemade infection tokens, right? This was the experimental phase for this specific technique. The attackers were testing the waters. 
They were trying to understand the utility of this technique. They used this technique up until August of 2021. Perhaps around this point in time the attackers realized that using homegrown infrastructure and homemade tokens was a little bit too noisy. A suspicious document reaching out to a random IP on your network is bound to catch somebody's attention at some point in time, which is why, beginning with the campaign targeting Turkey in November 2021, we saw the attackers try to legitimize their infection token tracking systems. They started using a more professional implementation, and they started adopting canarytokens.com in order to keep track of their infections, right? Now, canary tokens is a very interesting concept, and during the course of our research we tried to come up with hypotheses for why the attackers were using homemade infection tracking tokens, and canary tokens specifically. Canary tokens can be used for a number of purposes, and we have four hypotheses. The most obvious one is that you want to keep track of successful infections. On the other hand, we believe that canary tokens can also be used for anti-analysis. Basically, you can have an instance where you send a request to the token server first, and only if the token request is received will the server issue a payload to the specific infection. So this is a way to thwart isolated analysis of specific components of an infection chain. I believe that canary tokens can also be used as a form of timing check for anti-analysis. You check the duration between a token request and the request for the payload, and if it's too small, then the infection is probably running on a sandbox or some kind of an automated system. Canary tokens can also be used to find protections. 
If a token server keeps receiving requests from an infected endpoint, but there are no requests for the payload from the infected endpoint, then that means that the victim has some sort of blocking and protection mechanism in place. And the adversaries can then modify their strategy and change their techniques and tools in order to achieve a greater degree of success against that specific victim. All right, so I know this is a lot, and I know I rushed through a lot of the content because there was a lot of content. But I promise that I'm just going to present two more slides to you, and I'm just going to talk about the conclusions here. So, in case there weren't enough timeline slides for you, I have yet another one. This time we've tried to do something different. On the left-hand side, we have a timeline of all the campaigns that we've discussed today, and we've also presented and color-coded all the salient TTPs along with the different campaigns in the bullet points. But on the right-hand side, this is our attempt at showing the different commonalities between all the different campaigns that MuddyWater has conducted over the course of the last year. You see the use of honey tokens across different campaigns. We see that the honey tokens then evolved into a completely new and distinct campaign that was targeting Turkey. We see that similar payloads, similar PowerShell-based downloaders, file names, extensions and LoLBins were used across multiple distinct campaigns. And this indicates that the adversary is reusing their TTPs. They're reusing their TTPs and they're borrowing from one another based on the reliability and utility of a specific technique that they've used in the past. All right. So in conclusion, we believe that MuddyWater is a supergroup. 
We believe that it is an umbrella organization that consists of separate teams that are targeting different geographies. All of these teams borrow TTPs from one another, and we've seen evidence of the reuse of their TTPs from one campaign to the other in distinct and different geographies. However, that does not mean that the individual groups are not innovating on their own as well. We see that the groups are innovating on their own. They develop their own TTPs. They develop their own suite of tools. They develop their own tactics. They test them out. They refine them. They make them more reliable. And once the utility of these techniques has been proven, they will put them in a common pool of tactics and tools, and these tools can then be utilized by another sub-team in a completely different campaign targeting a completely different geography. Now, although there are different tools and techniques being used, and there's some amount of overlap between the TTPs as well, we strongly believe that all of these teams under the MuddyWater umbrella serve a common set of goals. They serve the interests of a common nation state, which is Iran in this case. And their primary focus, again, is to conduct espionage, to establish and maintain long-term access into their victims' networks, and to also do some amount of intellectual property theft as well. And that brings us to the end of the presentation. I think we have a separate session for answering questions as well, so we'll take that up. I believe that's at 3:10 today. Welcome back, everyone. So here's the time for the Q&A and panel session for the malware block of the NorthSec conference. First, please welcome Suhara. She's a senior security researcher at CrowdStrike. She will be doing a workshop on static malware analysis starting at four o'clock today. 
I think just a few minutes after the panel. And it will be continued tomorrow, so it's a two-day workshop, probably super interesting. And we have our presenters; everyone is here. So I will be asking questions from the audience. It's not too late if you do have questions. Your questions will pop up here; I have my laptop open in front of me. So if you still have questions or follow-up questions, feel free to send them to Slido and I will be able to ask them. First of all, thank you all for your very good presentations. And I have a few very good questions from the audience. So I'm going to start with the ones about air-gapped networks; this is the subject that I think we got the most questions about. I have a few questions where people were asking about all the proofs of concept that were presented at various places where you can extract data with light or with LEDs or with sound or with any kind of other method. So you presented about USB. Do you think that they are really just proofs of concept, or do you think that they really apply in the real world? So, great question, because one of the reasons behind our research was also to see how much of these techniques were in the wild, and the answer is really that they're not in the wild, as far as public knowledge is concerned. So it's not impossible that attacks happened and they were either never detected or never reported on publicly. That being said, those techniques don't execute out of thin air. There still needs to be a payload that will implement whatever technique, like using speakers and microphones or things like that. So there's still malicious code that will implement those techniques, and that code can be detected. So those are cool techniques for sure. 
But if you manage an air-gapped network, before you implement a Faraday cage around everything, make sure you tackle the USB stick problem first, properly, and that you manage software updates, and then you can focus on that kind of stuff. And do you think that the lack of us seeing it in the wild is due to the fact that we don't have the telemetry, that we couldn't actually collect them? It's a factor. But then again, in terms of the malware that we know uses USB, we don't necessarily detect the use of USB as a covert mechanism itself. We'll detect the files, like the malware that will write to USB drives, or that will gain persistence on the system, or that will iterate on the file system and find files to copy. So the physical communication mechanism that jumps the air gap is just one tiny part of the whole attack. And yeah, of course, it's hard to detect lights flashing at a certain speed. But there's plenty of other things that can be detected. So I think that it's not necessarily because of a lack of ability to detect that we don't see it in the wild. I have a follow-up question, but I think everyone can answer and give their opinion. It's about how it's always more sexy to talk about how we break stuff, how we're actually able to compromise something, or to talk about potential attacks. But it seems that we get less media attention when it comes to new ways of protecting against those attacks. Do you think as an industry we should focus more on 
the novel defense techniques rather than always spending so much effort on red teaming or trying to break the security of the different systems? So I've got an opinion on that. Let's hear you first. Of course, it makes better headlines saying "this is how I broke that" instead of "this is how I defended against the hypothetical attack". But the problem when you show how to break something without showing how to defend against it is: who will benefit from that information? Who benefits from a technique to break something that cannot be defended yet? So I think the responsible thing is at least to give some ideas. Sometimes it cannot be fixed easily, but at least give some ideas of how to mitigate, how to detect. Otherwise, you're just giving hints for attackers to break more stuff. Hello. Okay, so I agree with that, and I'd also like to add that it's a question of risk assessment as well. Attackers, and I mean even including ourselves who are not attackers, we're still kind of lazy, right? So you will try to find that low-hanging fruit, and that's what you're going to start by going after. And so yes, all these flashy new techniques are very great and stuff. I don't think they have to be overlooked. I think they're worth talking about. But we have to not forget that things like defense in depth and making sure that you have all your basics covered are also really worth it as well. So I have a question for you regarding your presentation, Leon. Do you think it would be possible to further help the decompilation, to see an equivalent of the JavaScript that was actually used as the source for producing the V8 snapshots? Yes, certainly, and that's a great question. So what we saw was an okay result, and we were able to work with it, but it was not ideal. And I do believe that we could make that a lot better, especially regarding all these properties. 
So when we're loading a library and you do dot something, dot something, dot something, it creates tens of lines of code, and it's really hard to read. So I do believe that the next steps for that specific tool will be to improve the result of the decompilation, and I do think that it's possible. Yeah. Thank you. I have a question for Asheer regarding the canary tokens. When people make defensive tools, when they provide some kind of service that can be used as a canary token, are there any mitigations that they can use to avoid misuse by bad actors? Yeah, so canary tokens is one of the tools and services that fall into the gray area. They can be used by legitimate parties such as red teamers, but they can also be used by adversaries to serve their purposes as well. I think it's very important for providers of these services to proactively figure out if their platform is being misused. And it is also the responsibility of the community, researchers like us, to provide the right kind of threat intelligence that prompts people and organizations to block certain events that are being used, or certain artifacts that are actually misusing them. So you have to have a proactive model where they can do takedowns and they can proactively go out and find misuse of their platforms. This is for the service providers. And then you have to couple that with the right kind of threat intelligence, so that other organizations can also protect themselves until the service provider takes action. And I have another question. 
Do you think it's also a problem because those canary tokens generate logs on someone's server? And do you think that can be used to further identify victims and perhaps help remediation? Correct. So using third-party services like canary tokens would leave logs which can be recovered and tracked, sometimes even by the service providers. They can be used in any form. The problem here is that something like canary tokens is not usually blocked or part of block lists. So it becomes a problem for various remediation teams to figure out whether this is actually red team activity, or whether this is an actual adversary in your network that's doing some activity of this form. All right, so thank you. We have some more general questions as well that everyone can answer. So there was a question about whether we are seeing an increasing trend in malware trying to steal or generate cryptocurrencies on non-traditional targets. Is this something you've seen? So I'm not the expert in that field. What's super common is IoT botnets deploying miners there, for some reason, because those are not really powerful machines. What makes the most money, I guess, is stealing rather than mining crypto, and that's going on quite intensely right now. Yeah, mining requires, I think, a huge botnet, rather than stealing, where you just have a single target that you must compromise. So, malware research. The field of malware research has been there for about 30 years now. 
It's quite long. Recently we've seen a new kind of job title helping in this research, called threat researchers, where there are people with no or little knowledge about reverse engineering trying to identify malware campaigns and so on. Do you think that there's a risk that attacks get wrongly attributed or that there's missing technical information? Do you think that there's a trend that we are perhaps not giving the public the factual or scientific output? For sure. I think the first part is that attribution is hard. So there are many, many threats, and some of them use common TTPs, right, tactics, procedures, and it is hard sometimes to distinguish between them. But I do think it's kind of a best effort, and because there's so much industry collaboration in the field, I find it makes it so much easier for researchers, and so it ultimately helps the public have the best information. It's not perfect, but it's a best effort, I find. I think the truth is in the code. So whenever malware is involved, you can't fully understand what's going on without thoroughly reverse engineering the samples. We actually had a case last week, I think, where some people put up a blog post saying they found similarity between Industroyer2 and another malware family. They used code similarity analysis techniques, and indeed there were some parts of the code in Industroyer2 that were pretty much identical to parts of the code in that other malware family, which I forgot the name of. So they were like, oh, why is this going on? Is there a connection with that? 
The problem is that one of our reverse engineers spotted that the code similarity was in standard libraries that are probably present in 75 percent of C code. So yes, there was similarity, but the person doing the analysis didn't understand what they were looking at, and that led to some very, very wrong attribution. And we actually called it out, I think nicely, and they updated the blog post saying further research showed that it's not a real connection. So it's important to pay attention to reversing because, as I said, the truth is in the code, I believe. Hi there. Okay, I guess I want to add to that. Lian and Alex make valid points about code, and that's where a lot of the hunting is. But I've also seen code that would trick analysts into thinking, oh, I see some interesting strings, it may look like it comes from a particular group. And this is where I think the original question was: there are individuals out there that do data analysis, and they try to make attribution with whatever existing data they're looking at. Those individuals are also important, controversially speaking. But I think the key is when you have reversers and analysts work together. And that's also kind of where I come from in my field of work. I create a lot of write-ups on what I've reversed, but I also work with analysts to see what they have on their end, and it's a lot of discussion from there on. Attribution is a hard thing, and it's a never-ending process. Thank you. I have another question. We've seen it during the invasion of Ukraine, but also in previous conflicts, though I think it's never been as clear as during the past few months in Ukraine, that malware is used in the context of war. Do you think that malware research as a field has an impact 
outside of the technical sphere? We talk a lot about our analysis, but do you think that our work actually has repercussions outside of the technical crowd? Yes, but to what extent? Of course it has some repercussions. How much, it's hard to tell. If you document a campaign or an attack that was never discovered before, you're going to expose tactics. You're going to expose assets used by the attackers. Some attackers will immediately pivot to something totally different. Some attackers will keep using the same C2s, the same malware, just tweak it a little bit. In those cases, did the publication really have an impact? Maybe not, or it helped people defend better, I guess. So it's a trick question. I think we will never know how much impact our publications will have, because those impacted are governments, and they won't call us saying, hey, screw you, you messed up our attack, or something like that. Anyone else? No? Okay, I'd like to hear Asheer's take on that one. So I believe that the intelligence that we generate leads to protection mechanisms, enhancement of existing mechanisms as well as creation of new mechanisms. And when we stop an attack at the right time, whether it's for critical services, a utility, or a hospital, in certain instances we don't really know how much of the attack we actually stopped and how much of the damage we actually prevented. And that's a conundrum, because you want to stop an attack as soon as possible, but if we do that, sometimes we don't know the scope of the entire operation. And so the work that we do is very important. Even if we publish disclosures where the attackers are not forced to change their TTPs, I'm very sure that in one form or the other we are protecting some of our customers or the general public as a whole. We have two interesting questions from the audience. 
The first one is regarding the trust of the security vendors, the security products you install. How do you address the issue of trusting that vendor, and the possibility that it may be compromised or subject to oversight by their respective government? That's a political question, to be honest, not really a technical question. I'll just say that, honestly, this is out of my field of qualifications. Sorry, captain kangaroo. The other one was whether we've seen an increase in mobile malware versus desktop malware, the regular stuff we see. I've got another one. He said it's for the noobs, but I think it's a good question. It's about how attribution works. How do we decide to publish an assessment that this attack was performed by a particular group, and do we have thresholds for thinking we have enough evidence or clues that it's them? It's a difficult question, actually. I can try. So I guess it's: how do you assess? It's a hard one, and every time I have to write an intel report on a piece of malware, I am constantly reviewed and critiqued on my assessment. But a lot of the criteria, I guess it's easier for those of us who are reverse engineers, because you look at code and that's a bit more factual. Especially if you see a particular technique doing something, then there's a high probability, a high confidence, that that proves that thing. But when it comes to answering questions like, do you know, or can you foresee, how this piece of malware or this attacker will plan the next move? 
That's difficult, and that's not an assessment that I like to write, and I try to avoid that. So it depends on the individual. When I do an assessment, it's very technical, and that's how I like to keep it. I agree with that, and I'd also like to add that sometimes we do get lucky. The opsec of the threat actors is not always on point, and it happens that we get lucky and gain extra insights into who these people are, or how they are doing things and why, and it's purely out of luck. So this does happen. And there's no one recipe to attribute an attack to a threat actor. If there is one, well, we're not aware of it. It's about comparing different indicators. Some are stronger, some are weaker, but they still matter. There are technical indicators, for example the reuse of an IP address that is not a shared hosting provider. That's a fairly good indicator. It's not foolproof, but it's a good indicator. Then there's the malware itself: have we seen that malware before? Was that malware attributed previously by the industry as being operated by a single entity that is labeled as, you know, APT something? And then you've got the targeting, and you've got other TTPs, use of spearphishing or whatever. When those are all put together, you can make an assessment that everything points to a campaign that was launched by this known actor. But again, it's an assessment. It's rarely a 100% determination. Do you want to add anything, Asheer? 
So I agree with everyone on the panel. Attribution is hard, and it tends to be very subjective at times. It depends on the amount of information you have, the output of your reversing exercises as well as your institutional knowledge, and looking at open source intelligence and code similarities and stuff like that. What I also want to highlight here is that it's okay if you can't attribute stuff. That's not a problem. We can convey doubt in an effective manner, and that's completely okay and justified. I'd rather convey doubt more effectively than come up with the wrong attribution, any day of the week. All right. Thank you very much, and thanks to everyone for your presentations. Good luck, Sue, with your workshop. And this will conclude the Q&A and panel for the malware section. I think it is a 15-minute break, if I'm not mistaken, or perhaps even a little bit more, but we will be back with the next section after. Thanks, everyone, for being here. Thanks for being online for those who are listening to the live stream. And I wish you a good rest of NorthSec. Thank you, everyone. So, welcome everybody. This is the detection engineering slash blue team block, a.k.a. the best block. Quick reminder: this is going to be three talks of 30 minutes. No questions after the talks, but we have a Q&A of 30 minutes afterwards. You can ask your questions on Slido; the QR code is going to be shown on the screen. It's on the Discord as well. Make sure to ask your questions. So the first talk is by Caspian Kilkelly. He's a senior consultant with CrowdStrike's Canadian services team. Caspian's career highlights include a variety of roles over 20 years that usually end in the word security or the word consultant. He has worked with different companies across North America and Europe, and at security conferences you've probably never heard of. So Caspian, the stage is yours. Thank you. 
Thanks. I like that variety thing. It's kind of what I'm going to be talking about today. So my name is Caspian; you've all heard that. I'm going to talk about 10 things I wish I knew before my first incident. I'm not the only one who wrote this talk. My manager at CrowdStrike, Shelly Giesbrecht, who can't be here today, would have been doing this talk, and I'm looking forward to seeing her version of it, because it's probably going to be pretty cool. We're going to go over a story or two, and then we're going to get into my top 10 list, and I hope you'll enjoy it. I'm just going to preface this by saying that everything I'm talking about in here is public information, so it's already stuff that's gone out. I don't speak on behalf of my employers while I'm here; I'm speaking from personal experience. I've been with CrowdStrike for a while. I've been doing this for 20 years, though. So that's kind of the whole thing. So let's start with my first incident. I was pretty young. It was a long time ago, ransomware wasn't a thing yet, and I was kind of a junior incident responder, really getting into my stuff, getting very excited about things. And we had a malware case, kind of like what we've been talking about for the last couple of hours, except much less advanced than anything we've been talking about for the last couple of hours. We hit the big red panic button. A bunch of people go running into this executive's office. We're freaking out. We're tearing apart computers. We're pulling stuff out. We're getting ready to do dead-disk work. And over the space of probably, oh, I don't know, a full day, panic ensues. Lots of panic. People are running around waving their hands. Oh my god. 
It might be a state-motivated threat actor. They might be attacking us. Turns out it was actually just an email campaign, and the only reason we were freaking out was because the email campaign deployed something on this one executive's desktop, and the poor executive went, yeah, let me just click on that and open it. That ended badly. It ended in a huge waste of resources. It ended with us scrambling pretty much all the jets and the search dogs and the fire department for nothing. It also kind of did some bad stuff for our security team, because we got looked at as the people who freak out over nothing, which is not a great situation. So there are some commonalities here. You're seeing them up on the screen, of course. But one of the things that I like pointing out is that there are usually three things you need for doing incident response well. They're up there; I'll let you read them. I'm going to talk about a second story, and this was a little bit later in my career, at a hospital. We had something worse happen. This time it was real. So ransomware hits a hospital. This happens, unfortunately. It's a thing. This time it's not so bad that it's making the hospital not work, but it's bad enough that it's messing everything up for us. And we hit the big red button, and everybody panics, and everybody runs, and thankfully we had the ability to call out for help. 
So we got some extra people in, but it took us a little while to get to that point. We also didn't have any way of talking to any one of the various divisions of the IT team or anything else. So it took us a while to get to the point where we were recovering things. So by the time we finished the recovery, about 16 million years later, actually it was close to six months, the threat comes back and whacks us again. Yeah, that was great. I think the main thing here, basically, is plans, communication and visibility. The visibility part is actually my number 10 on this list. It's the first stop: can you even detect threat actor activity? And a lot of people are going to be like, yeah, cool, we've got tooling, we've got automation, we've got all this cool stuff. And that's great. It's very important. I work for a company that sells automation and tooling. I worked for a company prior to this that also did that. I've spent a lot of my life looking at other people's work around it, and honestly, everybody's doing a good job in this space. I'll obviously say that my employers are better than everyone else because I work there, but I've also actually gotten to work with their stuff a lot. But the main thing here for me is actually people. You can have all the shiny blinky lights in the world. You can have all the machines that go ping. It really doesn't matter if you don't have people who are capable of handling those things. So there are two questions on this slide, obviously, and these were ones that I would keep asking every time I changed jobs, every time I got a new role as an incident responder. What does the security stack look like? And it's gone from Norton Antivirus, which should give you an idea of how long I've been doing this, to advanced endpoint detection and threat resolution, blah blah blah, intelligence, sprinkle some blockchain on there, whatever you want. And the thing with this is that that stuff's great. 
But sometimes your solution's baling twine and duct tape, or just grep. So knowing that you can detect stuff is great. Knowing what happened is even better and a lot harder. And the three things here on this slide, these are kind of an ideal situation. I've never worked with a team, either in consulting or in practice when I was doing it, where we had all three of these lined up perfectly. Usually what we've got is a situation where we've got some good investigators, but the logs aren't being stored anywhere, or the logs are only being preserved for a week because there are too many of them, or on and on and on. In a lot of these situations, the outcome of your actual incident response is probably not going to be as good as you want it to be. It's not going to be one of those things where, hey, this is great, we resolved everything. But let me talk about one where it did work for me. This was a little while ago. It was another minor ransomware case, and this time basically I walked in the door (I think I was working alone at this point) and I said, "Where are your logs?" And the IT person comes back and says, "Oh, they're right over there. All 40 gigs of them. Go nuts." So several hours later I had started to be able to reconstruct how the threat actor got in, what they did, everything else. This is not a lot of systems we're talking about; it's a fairly limited scope. And what was kind of nice about it was we actually had a resolution within a week. It wasn't an incident that just dragged on and on and on. We actually knew what we could do. So that's a good situation. I've been in a lot of bad situations, and usually it's because logs are missing, or they're not stored long enough, or people aren't trained on their equipment. I think I'm repeating myself, but I think you get the point. I've also worked with a lot of grep in the past.
That's why this is up here. This is actually me 25 years ago, when I got my start in incident response. So, the next question. We've gone with "Can we see the TA? Can we investigate?" This is a bigger piece for me: what can I contain? I love this picture, and that's why I put it up here. There are a couple of main things about containment. Containment should be your first move on an incident, but in a lot of cases your first move is, once you get past this part, go to containment. Look at what you can contain. And you should know what you can contain, but a lot of people don't, because there's more than just the incident response team working in IT. So you're going to end up in situations where your containment isn't working because you're missing pieces. There's missing communication. That kind of goes back up to my top slide, right? I've worked with teams that really know their stuff, and they've got good processes, and there's been tooling and investigation (and this is also doing the consulting side of things), but for me the containment part is usually where people trip up. "We think we've succeeded. We're not sure. Can we call you and get some more information?" And I've been the person asking those questions. And for me, it's kind of perpetually a case of: how sure are we? How far can we see? So let's go back to slides: who's actually able to help us here? Which is another piece. So I think with ransomware it's actually really simple to talk about containment. We're basically saying, okay, stop the spread. Cool, great, we've done that. What about the threat actor and the C2 networks and the RATs and the data? Well, you can't prove the data was exfiltrated unless you can see what's going on on the network. And if you can't see what's going on on the network, let me back up a slide.
No, I'm kidding. So the next piece after the containment side, and actually in some cases at the very beginning of the containment side, is: who do I call for help? Where's your team of avalanche search dogs coming from? Are you able to call across the room to the IT engineering department and say, "Okay, can you close all this stuff up?" I've got a really interesting story about that one, actually. I was working at a place where we had a really, really large network, and what basically happened one Friday (of course, at around five o'clock, when I was on my way out to a conference) was the entire network went offline. Not just the internal one, but our connection out and literally everything else. We basically DDoSed ourselves from the inside of the network. And the network engineering lead comes over to me and says, "You're never going to guess what just happened." And I'm like, "Well, I think I am, because you're going to tell me, and I'm going to bet it was a worm." Yeah, it was. So the problem with this was we had no way of reaching the side of the large campus that I was working on to talk to the people who had the worm, to tell them to stop, for three hours. So our entire network was offline for three hours because we didn't have a phone number. The other piece of this, and this kind of turned into something later, is that the reason this DDoS happened was we had a piece of technology that had really, really, really poor internal security. I don't know if any of you have ever run across a username of "developer" and a password of... you can guess. So this is where the functions that end in R come in. In this case HR wasn't involved, PR thankfully wasn't involved, but lawyer was involved. And then there was a third one that we didn't have to call up, or fourth one.
Sorry, external responder. Knowing when to escalate is kind of important, because getting those people, those R's, on board is sometimes going to require a call at three o'clock in the morning, and sometimes it's going to require that they actually get over there in time. When I've worked for on-site incident response teams, which I've done a little bit, we usually try to have a 24-hour readiness time. If you're in the middle of a ransomware case, that's a lot of time, you know. So be prepared to get extra help, know who to call for extra help, and know who to escalate to. And if we go back to my original sort of not-exactly-an-incident incident, the escalation went to me, but I was the only incident responder there. I was the junior incident responder, and that's all I did. And it was coming from an executive who probably should have been the person who actually had the power to do that, but didn't really know how that worked. And I think this is probably one key point to take away from this: you need sponsorship from somebody further up in the organization. Even if you're at the top of the organization, you should probably have that backing. So the next piece in my top 10 list is criticality. This is one of my favorites, because I do a lot of strategic practice work as well as incident response, and I help run red team/blue team exercises and do tabletop exercises with people. So we spend a lot of time doing simulations and sitting folks in a room (not as many folks as are here right now, but sometimes we get quite a lot), and we usually will have a question like, "Okay, so where are your critical systems?"
And my favorite thing in the world (and by favorite I mean I absolutely hate it, and it's kind of terrible) is the number of blank faces I see when I ask that, because there's always an "Oh yeah, I mean, I guess the point-of-sale machines are critical, or maybe it's the electronic health records, that's probably important, or maybe our SQL data," and then it balloons. And this is the thing with what critical is: it's obviously going to vary per organization. If you're running a store, for example, and you're doing sales, it's going to be a very different thing from if you're a hospital or a factory that makes widgets. It's also going to vary per team. Everyone thinks what they're doing is important. You don't really want to take that away from them, obviously, but you do have to have a hard conversation about who gets budgets, and who's responsible for patching, and who's going to be there when the lights go out, and who you're going to call to restore everything. Because as an incident responder you can't do everything, and this is why we call it a team, but the team kind of extends. The other piece of this is who knows what, and who knows what they own. And this is my favorite sort of dig-down when we're doing tabletop exercises. The worst thing in the world in an incident is to not know who owns the system you're working with. I actually had a case years ago, it was another malware case (so that gives you an idea how long ago it was, because, you know, malware now is solved by everything), but it was malware on a Windows XP machine. Malware on a Windows XP machine in a part of one of the buildings I worked in that no one knew even existed, until the malware pinged our antivirus and went, "Hi, I'm here, calling out to Russia, come get me." And we're like, okay, where is this? Where is the system?
Turns out it was in the basement, and it was attached to our HVAC system. And when we got down there, we start tearing it apart. We're like, okay, first of all, no one knew this was here. Whose is it? Can you tell me what it does? It took us two days to figure out it was connected to the HVAC system. That was another one that kind of turned into an emergency for us, because it was around the time of Stuxnet and there were questions being asked. And thankfully, again, it didn't blow up too big. We didn't have to do the whole panic, arms-waving thing, but we did have to spend a lot of time doing a very long and deep investigation after that, just to make sure everything else was okay, because we also discovered that the HVAC system was controlled by a whole bunch of PLCs and workstations and servers that we hadn't seen yet and weren't on our radar. Is this critical? I don't know, is an HVAC system critical for a building? I think so. I ran a simulation a little while ago with one of our hospital clients that I kind of love, because we actually set it up so that we just turned off refrigeration for them. Turn off refrigeration in the middle of a COVID vaccine campaign and you have a huge problem on your hands. So again, going back to knowing what's critical and who owns it: that's going to extend outside the IT team sometimes. The next piece is "how do we customer", and I like saying it this way because it's also "how do I even?" or "how do I cope?" This isn't a self-help seminar, so I'm not going to get into the individual coping skills that you have to have as an incident responder, which I'm sure all of you have probably struggled with if you're an incident responder. And probably if you do security at all, you've had to struggle with this. This part's about the organization. You can see what's on the slide, obviously, but let me talk about something that is in the picture on the slide.
This is a picture of Woodstock. This is the nice picture of what was up the field from what was down the field at the end of Woodstock. And all the way through Woodstock they had to hire helicopters to get musicians in and out, because they were so narrowly focused on the idea of "let's do this show" that they sort of forgot to install porta-potties, and people got sick, and they couldn't get ambulances in. Woodstock was actually worse than the Fyre Festival in some senses, just in terms of the emergency preparedness and dealing with all of this stuff. So you're probably asking yourself, "Caspian, what's your point?" Which is a really funny thing to ask yourself, because your name is not Caspian. But my point is really simple: be prepared for denials of service, be prepared for things shutting down, be prepared with out-of-band communications. Going back again to my first non-incident story, and actually the second one as well, that out-of-band communication piece was huge. If we had had it, it would have been very simple. If we thought all our networks were compromised: cool, let's switch to Signal and talk, or something else. Right now, though, we're kind of having a hard time here, because we can't talk, we can't speak, we can't see, we don't know what's going on. So this brings me to the next piece: we also don't know who does what. I kind of talked about this in the criticality piece. Not knowing who owns what is going to slow you down; I mentioned that with the story about the Windows XP box in the basement. Not knowing who does what is going to hurt you even more. If I go in during an incident and turn off a critical system, or I just turn off a system, or I trip over a power supply that knocks out the billing system for somebody, we need to know who's going to call the people at the end of the billing system.
We need to know who's going to turn the power supply back on. I need to know all this other stuff. Incidents don't occur in a vacuum. So as much as, as an incident responder, my focus is going to be: put out the fire, stop the bleeding, fix all the stuff, there's all this other stuff going on where we're going to need to actually call these people in and get them to work. So, the four R's, as I mentioned before: your lawyers, your HR, your PR, those folks. A lot of good, prepared organizations I work with (and I'm saying good and prepared sort of separately, because there are a lot of good organizations that are unprepared), the ones that are prepared usually actually have a sense of who does what in this case. You know, we do check who the sudo reports go to. That's actually kind of important: knowing about that, and putting that down somewhere in a way that's accessible for your responders. For me, if I'm your responder, it's going to make a huge difference, because if I don't know, then who do I talk to? So, the next piece, speaking of talking, if I can actually get it to advance... there we go. The next piece is connections. You'd think I would have put this at the top, but I'm kind of going in reverse order; I don't know if anyone's noticed that. Connections are really important for a couple of reasons. The first piece on this slide is actually the one that you probably should be thinking of before the incident. Have you scanned externally? Do you know how everyone's getting in? Do you have a way of getting in to that data center or anywhere else remotely? And if you don't, are you prepared to drive for two hours to Laval? Or Toronto?
That's going to take longer than two hours if you're driving from here. Access is a really big deal. Loss of access, especially when everything goes out, is also a really big deal. I don't have a good story about this, because I can't talk about some of the stuff that I've seen, except to give a vague indication that that two-hour drive was something that one of my clients did experience a few times, and they had to basically say, "Yeah, we need to figure out a way to get in." Well, and I'm merging a whole bunch of people in this case. So they build this bastion host, and it's really nice, but it's getting scanned constantly, and they haven't put anything on it to prevent it from actually being used for remote desktop access. One of my favorite threat actors really loves using remote desktop to just sling stuff all over the place. Guess what happened. It was a lot of fun to solve that one. We had this really cool new tool that had just come out that we could actually map the remote desktop sessions out from, so we could see exactly what they were doing. I sort of chuckled when we saw this person just basically download Metasploit from Rapid7's site and start using that. This piece, I'm sure you've all had to deal with at some point. I have, extensively. This is one of my favorite parts as well. This and criticality, they go hand in hand. They're like drinking too much and a hangover; they're just going to be there all the time. So, first of all: please, please save me next time and test your backups. I've been in so many situations, myself and with other people, where those backups weren't properly tested or the restore process wasn't properly tested. That's an even bigger one. I mean, if you're restoring over a T1 line and you've got terabytes of data, that's not going to work out.
Well, if that AS/400 that you don't have a backup for, that drives all of your production systems, goes down, you may want to have a backup system for it. Backups are a big deal. Backups are a cost center. Backups are also what allow you to get out of the ransomware incident quickly, because that's usually what's going to happen. We may find you a decryptor, we may find a key. That's cool. But man, if you can restore from backup quickly enough, you're going to be fine. The rest of it's going to be an investigation. So this brings me to number 10 on my list. I've been hinting at it all day: do you have a plan? I work with lots of people who don't. They didn't write it down, or their plan's a nice ISO document that has a phone number in it, and that's it. So there are good plans and bad plans. The plan you wrote for the auditors is a bad plan. The good plan is not the end of my slide deck; it's all these little boxes here. Let me walk you through this, just in case you got tired of being rickrolled. The "never gonna" is your actual master incident response plan. That's the thing that has all the phone numbers, that you have tearaways from; you give them to people and say, "Okay, you, go. IT engineering guy, help me out here." The next piece: playbooks. Playbooks are kind of useful when you're doing this all the time and you're kind of panicking your way through something, because you've actually got something written down that you can go back to and say, "Okay, we've done this before, let's see how it works," or "We've planned this out and tested it." Did I mention testing? Testing is really important. And then the last piece, the awesome piece, is the fact that those playbooks and the plan and everything else go right back into a continuous improvement process that you can use to build better IT and build better incident responders. So that's it. I'm just going to do this one more time.
I've never rickrolled a room full of people before. Thank you all very much. I hope you enjoyed that, and obviously questions will be later, so I'll be back for that.

All right, it's time for the next talk, which is "JavaScript Obfuscation Classification with Machine Learning" (those are scary words) by Yuri Arbitman. As a data scientist at Imperva, Yuri develops machine learning solutions for various cybersecurity projects. He's fascinated by the wonders that data science and machine learning bring to the world, and the wealth of open source frameworks enabling us to build systems today at a scale and ease unthinkable just several years ago. In the last 20-plus years Yuri has been working in the high-tech industry in Israel, including for several great companies, in engineering, management, and research positions. Yuri holds a master's degree in computer science from the Weizmann Institute in Israel. A round of applause for Yuri.

Thanks. Thanks, Amelia. Hi everyone. I'm excited to see all of you face to face, and those watching the live stream, thanks for joining. My name is Yuri.
I'm a data scientist in the threat research group at Imperva. Imperva is a leading cybersecurity company that's been around for about 20 years, focusing on data and application security. We provide our customers a wealth of solutions, like web application firewall, DDoS protection, client-side protection, account takeover prevention, and more. Today I'm going to tell you about one of our recent projects that combines research in security and machine learning. This project is about classifying pieces of code into obfuscated and clear-text ones. So let's start. The motivation is clearly to help our customers prevent attacks, and one of the ways to do so is to concentrate on the client-side perspective. A typical website has many, many resources. Most of them are totally harmless, but there could be several malicious ones if attackers managed to inject them into the website. And the question is how we find them and how we distinguish between those. Since a website has many, many resources, this is something that clearly cannot be done manually, and moreover, even semi-automatically is not good enough. We need a totally automated process to do that. So let's take a simple example of resources and see how it looks. Here we have a piece of code in JavaScript which is actually a keylogger. I'm not sure if you can see my laser, but anyway, here above you can see the command and control endpoint, which is example.com. You can see that we are listening to keydown events, and we're sending the data back to the command and control every 10 seconds. And now let's take a look at another piece of code. It looks really similar.
You can see that the command and control endpoint here is missing, and you can see that we are listening to the same event, and we are just logging it locally once every hour. I just highlighted the differences here for you. And as you can see, for this very, very simple, almost trivial example, the differences are very subtle. So it requires really tedious work and attention to the tiniest of details to even spot them. As I said, at Imperva we are interested in this problem, and an example is our client-side protection product, which enables our customers to look at the resources on the website and see which ones are potentially harmful. We provide several scores and several data points for those resources, so the customers can really understand what's going on. And as you saw in this really trivial example, this problem is really, really difficult for humans. And the question is whether it is solvable at all for machines, and by machines I mean, of course, in the machine learning sense. Just one more word about this: before this project, the way we tackled it was to use the best data we have in the company, to calculate reputation for IPs and domains, and to combine all this data in rule-based techniques. That's how we provided these scores. So now back to our story. When we see a difficult problem, what do we do?
Well, we search for an easier one. And this is, of course, for dramatization purposes only, for this presentation. So an easier problem, which I'll explain in a second why it is at all related, is obfuscation. For those of you who have never heard of it before, obfuscation is an interesting thing. The theoretical foundations of it go deep into computer science and computational complexity, but if you are down to earth a bit, it is just a family of algorithms, of techniques, that transform your code. They take your code and transform it so that it preserves the functionality and hopefully the runtime, but it really hides the inner structure, in a way that makes it almost unintelligible for humans to really understand what's going on. So let's take an example. This is the most trivial piece of code in JavaScript you could think of. It's just printing "hello world" to the console log. And when I take this piece of code and obfuscate it using one of the readily available tools on the internet, we end up with all this, and actually even all this too. So you may say, well, this is ridiculous.
This is probably some contrived example. But you are wrong, and the reason is that I just used a tool with the lowest obfuscation level possible in this specific tool, which is obfuscator.io. If I used this tool with the highest level of obfuscation, we would end up with many, many pages that I would have to show you here. So this is a real-world example, and this is really how obfuscation works. And that is just for this trivial piece of code; imagine what happens when your code really does something interesting. And of course the question is how all this is related to maliciousness, because remember, we started with maliciousness. The answer is that it depends on the language. The thing with obfuscation on the client side, for example in JavaScript, is that sometimes it can be used for legitimate purposes, for example for protecting intellectual property, if you want to hide your code because you invested a lot of time developing it and there is nothing malicious there. And there are a few additional reasons. But on the server side these two problems are really closely related, because usually you don't have to protect your code from hackers on the server side if you don't move it anywhere and do not give it to anyone. And still, if we look at these two domains together, client side and server side, obfuscation is a very, very interesting signal that usually helps us to determine the maliciousness of a piece of code. Okay, so what we want to do from this point on is to classify every JavaScript document into clear text or obfuscated. And the question is, is it easy for humans? So let's look at another example. I hope you can see it; I hope it's not too small. I tried to make it as large as possible. Take a few moments to try to understand what it does, if you will. And then of course we have another piece. They look pretty similar, don't they?
Okay, so without torturing you too much: this is a part of a malicious script that I took, and this is indeed obfuscated. And this part, the first one, is actually clear text. This is a part of a script from YouTube that is used to speed things up if you embed a piece of YouTube on your website. This is just a small part of that code, and it probably calculates some hash function or things like that. So, as you saw, it's not really easy for humans to distinguish between the two cases. And actually it turns out that it might take a lot of time even for a seasoned security researcher. The reason is that in order to decide whether the code is obfuscated or clear text, what this researcher usually does is really try to understand what the code does. It's not written anywhere whether it's obfuscated or not. We need to understand what the code really does, and sometimes it takes time. So it's clearly something that is unscalable in any way. So the question is, is it easy for machines? Of course, we didn't jump into the machine learning solution right away. We tried working with heuristics, and it turns out that heuristics are not good enough, simply because they are too specific. We will talk about the various obfuscators in a few moments, and you might then understand why. So the mission from this point on is to build a machine learning classifier for this problem. I'll tell you about several approaches from the literature we could use. The first approach is a classical one. There is a paper by Tellenbach and several additional researchers from 2016, where they propose to do classical feature engineering, meaning to extract things like average length of line and frequency of specific words and specific characters, and then build a decision tree.
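As a rough illustration of the classical feature engineering just described, here is a minimal Python sketch that computes a few hand-crafted features from a script's text. The feature names, the two sample snippets, and the choice of features are assumptions made for illustration; they are not the feature set from the cited paper or from the speaker's model.

```python
from collections import Counter


def extract_features(source: str) -> dict[str, float]:
    """Compute a few simple, hand-crafted features of a script's text.

    The feature names here are hypothetical, chosen only for illustration.
    """
    lines = source.splitlines() or [""]
    total_chars = max(len(source), 1)  # avoid division by zero on empty input
    counts = Counter(source)
    digits = sum(n for ch, n in counts.items() if ch.isdigit())
    return {
        "avg_line_length": sum(len(line) for line in lines) / len(lines),
        "backslash_ratio": counts["\\"] / total_chars,
        "digit_ratio": digits / total_chars,
    }


# Two made-up snippets: readable code versus hex-escaped, single-line code.
clear = 'function greet(name) {\n  console.log("Hello, " + name);\n}\n'
packed = 'var _0x1a=["\\x48\\x65\\x6c\\x6c\\x6f"];console["\\x6c\\x6f\\x67"](_0x1a[0]);'

clear_features = extract_features(clear)
packed_features = extract_features(packed)
```

A decision tree would then be trained on vectors like these; the sketch stops at feature extraction so that it stays free of third-party dependencies.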
So I guess you've heard about these kinds of trees, and here is a simple example. This tree is checking whether the average length of line is larger than some threshold, and if it is, it then checks the frequency of a certain character, and if that exceeds the threshold, for example, it says that the script is obfuscated. Of course, this is a really simple example, just to let you understand the principle. Another approach was suggested by Skolka and his colleagues in 2019. They suggested building a deep learning classifier based on convolutional neural networks and abstract syntax trees. A convolutional neural network, just in a sentence, is a method used a lot in vision, but today it is also used in additional fields in deep learning. As for abstract syntax trees, you can easily download a tool that takes your script, JavaScript in our case, and builds out of it the syntax tree for the script, where you can see the structure, and you can extract many interesting features out of this tree. And then these features are fed into their network. And the last possible approach is natural language processing. In recent years we have seen really tremendous advances in this field. We see amazing models that understand human language and enable us to solve a multitude of tasks, for example question answering and text classification and many more. So in our case, what we can do with it is take a model (BERT is just an example), use the weights that the model was trained on, and then train it on a downstream task, namely our task. We just feed it documents, which are JavaScripts classified into clear text or obfuscated ahead of time, and then the model learns from them and adjusts the weights that it had learned previously. So our first approach was inspired by the first work.
We started simple. We wanted to build a simple model, and we wanted to benefit from decision trees, which enable us to gain explainability. When we use a decision tree, it's very easy to see which features are affecting the model and how. So we took about 40 features and we trained on a single obfuscator, and it turned out that this model didn't generalize. So what went wrong? In order to understand that, we have to dive a bit deeper into the various obfuscators for JavaScript. Here I prepared a list of the most used obfuscators for JavaScript, sorted by a popularity metric from GitHub, which is the number of forks. As you can see, UglifyJS and obfuscator.io are the most used ones, and there are additional ones here. So what methods do these obfuscators employ? Let's see a few examples. The first thing people immediately think of when they think of obfuscation is renaming of variables and functions. And indeed, if you think of your code, if you rename it a bit, sometimes it makes it clearly unintelligible, even if before it might not have been really easy to understand. In this simple example, you can see we just rename a few things and it already looks a bit more intimidating. An additional technique is modification of function calls, function arguments, and return values. In this simple example, I just took a function that squares a number, and I show you a snippet of the code.
It's not the whole code that was produced after obfuscating it, but you can clearly see that the function gets many parameters and returns many things, and it really complicates things a lot. An additional key example is modification of strings by using encoding, encryption, and string generators. In this example we have a simple variable, and if we want to find the strings that were used inside of it (like, you know, the Honda Accord car), you can see that the strings are split and then several arithmetic operations are used. If you look at the resulting code without having seen the clear-text one ahead of time, it really takes time to understand what happens there. And the last example in this context is the manipulation of constants. If we have a simple constant, we might end up with some expression which we need to calculate to understand what the constant is. And there are additional methods, like changing the base of integers and injecting dead or redundant code, which of course complicates things a lot. So what are the differences between those obfuscators, and in general between obfuscators? The differences are, as you might imagine, the specific naming methods, the encoding and encryption functions that are used, and some more. Moreover, most of these obfuscators can receive a lot of parameters to tweak them, and the resulting code looks very different if you run a single obfuscator in mode one versus mode two. Just as an example, javascript-obfuscator has 40 parameters and UglifyJS has 30 parameters.
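To make two of the techniques above concrete (splitting strings and hiding constants behind arithmetic), here is a toy Python sketch that generates the obfuscated expressions as text. It is only an illustration of the idea: the function names are hypothetical, it does not reproduce what any real obfuscator emits, and it assumes the input string contains no quotes or backslashes.

```python
import random


def split_string_literal(text: str, chunk: int = 2) -> str:
    """Return an expression that rebuilds `text` by concatenating small pieces.

    Assumes `text` contains no double quotes or backslashes (no escaping is done).
    """
    pieces = [text[i:i + chunk] for i in range(0, len(text), chunk)]
    return " + ".join('"' + piece + '"' for piece in pieces)


def hide_constant(value: int, rng: random.Random) -> str:
    """Return an arithmetic expression whose value equals `value`."""
    a = rng.randint(2, 9)
    b = rng.randint(10, 99)
    # ((value * a + b) - b) / a evaluates back to value
    return f"(({value * a + b} - {b}) / {a})"


rng = random.Random(0)  # fixed seed so the output is repeatable
string_expr = split_string_literal("Honda Accord")
constant_expr = hide_constant(42, rng)
```

The generated expressions use only string concatenation and basic arithmetic, so they happen to be valid in both JavaScript and Python. Even in this toy form, a reader has to evaluate the expression to recover the original value, which is the effect described in the talk.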
So, as you might imagine, when we compare different obfuscators, the output distribution clearly looks different. If we think about distributions, we get different values for everything you like: average length of line, length of word, function size, proportion of encoded characters. This is just an example: if we measure the normalized backslash count, which is one of the features that was proposed in the paper I mentioned before, or the average length of line, you can see that the differences between the obfuscators can be really big. And this is log scale, by the way. Before telling you about the approach that really worked for us, I'll just say a few words about the data we trained the model on. We used a public data set of about 150,000 JavaScript files, all of them clear text, and we applied four different obfuscators to them. Four, because we wanted to test the model on an additional three that the model had not previously seen. We got a perfectly balanced data set of about 100,000 clear-text JavaScript files and 100,000 obfuscated JavaScript files, equally divided between the obfuscators. Our approach combined approaches number one and three that I mentioned, and the idea is as follows. We tokenize the input into words and we calculate the most common words in the clear-text JavaScript. So we take all the clear-text JavaScript files, we extract the common words out of them, and look at several hundred of them. Then for every input we measure the difference between this input and this calculated distribution. Let me give you a small example. Assume that the top three words in JavaScript are function, document, and input, and assume that function appears about once in every 20 words, document once in every 33 words, and input once in every 100 words. So if we have the following JavaScript, and if it's too small, I'm sorry, but you don't have to see the
exact details, I will just read them out for you. Here we have 71 words. If we look at the word function, it appears twice, meaning it appears in about three percent of the cases. The word document appears once, and the word input never appears. So if we calculate the difference between the clear-text occurrence, calculated as I showed you in the previous slide, and the actual occurrence, we can see the differences in red. What do we do with these differences? We feed them into a boosted decision tree, and this is very similar to the example that I gave you before. Then we just look at the differences for each specific word, how many times it appears, and then we can know whether the text is obfuscated or clear text. I hope this is clear. This is just an example of a decision tree with the different features; every feature corresponds to a top word we extracted from the clear-text JavaScript files. It turned out that the performance of our model is pretty good. We were mainly interested in this case in false negatives and false positives. You can see that both of them are less than one percent; that was the product requirement. So what's a false negative in this case? A false negative means that we are classifying some document as clear text when in reality it is obfuscated, and a false positive means we classify a document as obfuscated when in reality it's clear text. We were interested in a case where the false negative rate is smaller than the false positive rate, and the reason is, as you recall, as I said, it's related to maliciousness.
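The word-frequency features from the worked example above can be sketched as follows. This is a minimal illustration, not the speakers' implementation: the tokenizer regex, the function names, and the three-word `EXPECTED` table are assumptions (the talk uses several hundred top words), and the resulting deltas would then be fed to a boosted decision tree, which is left out here.

```python
import re
from collections import Counter

# Illustrative expected frequencies from the talk's example:
# "function" about 1 in 20 words, "document" 1 in 33, "input" 1 in 100.
EXPECTED = {"function": 1 / 20, "document": 1 / 33, "input": 1 / 100}


def tokenize(source):
    """Split source text into identifier-like words (an assumed tokenizer)."""
    return re.findall(r"[A-Za-z_$][A-Za-z0-9_$]*", source)


def frequency_deltas(source, expected=EXPECTED):
    """Return actual-minus-expected frequency for each top word."""
    words = tokenize(source)
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty input
    return {word: counts[word] / total - p for word, p in expected.items()}


# A stand-in for the 71-word script on the slide: "function" twice,
# "document" once, "input" never, and 68 other words.
sample = " ".join(["function"] * 2 + ["document"] + ["x"] * 68)
deltas = frequency_deltas(sample)
for word, delta in deltas.items():
    print(f"{word:10s} {delta:+.4f}")
```

Each entry in `deltas` is one feature; a script whose words have been renamed or encoded drifts far from the clear-text frequencies, which is the signal the classifier picks up.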
So we do not want to flag cases where people would look and say, oh, in reality it's not obfuscated. We don't want to increase the number of these cases. The next question we asked was whether our approach is generalizable to additional languages, and it turns out that the answer is yes. We looked at Python and PHP, for example. As I mentioned before, the case for these languages, since they are server side, is that obfuscation and maliciousness are closely related to one another, and this is why this case is even more interesting. And we got really good results. Of course, we trained these models on the specific data sets that were chosen for these languages. So let me show you just a very short demo. This is the QR code you can scan. We set up a website that you can use. It's publicly available, with some version of our model, and you can just play with it and see what's going on. If we take this script as an example, you can see it's non-obfuscated. So we can just take it and feed it into our model. This is a live demo, by the way. You can see that the model says that this is clear text with probability 0.95, meaning the model is pretty certain. And here you can see SHAP values. I don't have time to explain what they really are; they are just the significance of the features that affected the decision. In this case you can see an example of a 1-gram model, which is a little bit different from what I described before. It's one of the models that we built during the development of this project. Let's take another example, just a quick one. I took a piece of code and obfuscated it. I'm not sure you can see it, unfortunately the window is a bit small here, but you have to trust me on that. So if we feed the code here, then it says that it's obfuscated.
Of course the probability is rounded. And you can also see the various features and how they affected the decision. I cannot promise that this website will remain in this exact form, but if people are interested and use it, we might enhance it and continue maintaining it. So what have we learned in this project? The first thing is about the relationship between maliciousness and obfuscation in different languages. As I mentioned, it's not a one-to-one relation, but usually obfuscation is a strong signal for maliciousness. And it's really cool, because I think that solving the obfuscation classification problem is much easier than solving the maliciousness one, in terms of machine learning, of course. We saw that classifying code raises many problems that are difficult for humans, and even the problem which might seem simple, classifying code into obfuscated or clear text, is not so easy. And as we saw, building a machine learning model that solves it is not very sophisticated and not too hard. Another thing is about the obfuscators. We had several choices here: building a machine learning model per obfuscator is not scalable, and looking at the internals of each obfuscator is not scalable.
So we used this, you can call it a trick or an approach, of looking at the clear text and extracting the top words out of it, which is inspired by a natural language processing approach, and this really enabled us to solve this problem. Another nice thing is that this framework seems general. We didn't test it on all languages, but I think Python, PHP, and JavaScript are the most interesting ones in terms of obfuscation, and there is no reason to believe that this wouldn't work for additional languages, provided, of course, that you train the model on a large enough data set that is properly balanced. So thank you so much. I'm looking forward to your questions. Hope you enjoyed the talk. So it's time for the last talk of the day, by Maya, I forget how to pronounce your name, Kaczorowski, and Eric Chiang. Maya is a product manager at Tailscale, providing secure networking for the long tail. She was previously at GitHub in software supply chain security, and before that at Google working on container security and encryption key management. Prior to Google, Maya worked at McKinsey and studied mathematics at McGill University. Eric is an engineer on Google's enterprise infrastructure protection team, where he builds systems to scale internal security processes. He has previously worked on Linux fleet security at Google, upstream Kubernetes, and cloud authentication systems. Thank you. Is your mic on? Can you hear me? We can, great. We're here to talk about the road to BeyondCorp and how to get to BeyondCorp. As was just introduced, I'm Maya. I am a product manager at Tailscale. Tailscale is a WireGuard-based mesh VPN. And yes, I do have stickers. Lots of stickers. Lots of stickers. Yeah. And I'm Eric. I'm a security engineer on the enterprise infrastructure protection team at Google. We deal with Google's corporate security, which is relatively relevant to what we're about to talk about today.
Yeah, exactly. So what are we going to talk about today? First we'll cover what we mean when we say both the words BeyondCorp and zero trust, and why those are important and particularly poignant right now. Then we'll dig into what BeyondCorp is, to better understand its components and what is needed for it to function. We'll talk about what companies have typically done or tried to do to adopt controls like BeyondCorp, and what steps you can take towards a zero trust architecture. And we'll wrap up by going over the long tail of problems we're going to hit by trying to make something truly zero trust. So, to jump in: zero trust. Zero trust is a very trendy phrase these days. But in addition to coming from, you know, lots of vendors and analyst firms and, yes, conference talks, it really does seem like the concept of zero trust has become particularly trendy and mainstream very recently. And that's probably due to what just happened south of the border. In late January 2022, the U.S. government published a memorandum on zero trust principles and on moving the U.S. government's institutions and networks towards zero trust. This was kind of surprising, although maybe not that surprising, because it came after May 2021, when the U.S. government published Executive Order 14028 on improving the nation's cybersecurity, and this is a follow-on to that. As a result of this memorandum, by now U.S. federal agencies have adopted their zero trust plans into their cyber plans, and they have to implement what they've decided they're going to do by the end of 2024. So I guess this is just how we do security trends now. So how are zero trust and BeyondCorp related?
Let's start with traditional networks. In the traditional network architecture, you rely on a network perimeter to delineate between trusted and untrusted users, such as trusted employees inside of a firewall versus potentially untrusted actors outside of it. By moving to a zero trust architecture, the location of an individual, specifically what network they're on, is no longer solely what determines whether that individual is trusted; other context is used to determine whether or not they can access a given application. So with a zero trust architecture, there's no longer such a thing as a privileged physical corporate network. One thing that zero trust doesn't mean, though, and people get confused about this, is that VPNs are bad. And I have to say that despite, you know, the fact that I work at a VPN company. There's nothing inherently wrong with a VPN, but there is something wrong with assuming that because someone's on the VPN they're trusted, or that because they're on the corporate network they're trusted. So it's more like, you know, VPNs that have no application-level access controls are bad, sure. BeyondCorp, introduced in a paper in 2014, is Google's specific implementation, from which the broader zero trust principles and zero trust architecture emerged. Yeah. And so BeyondCorp is fundamentally an access question. This could be access to something like your internal corporate wiki. And the way it determines if you, the user making that request, have access is that it considers two primary principles. One is pretty classic, right? It's who you are. You know, what are your credentials? Can you get into this wiki? And then, more novelly, it also considers what device you're coming from. So a really good example might be a personal versus a corporate device, right?
You might be able to have your credentials on your personal device, but that doesn't mean we necessarily want you to be able to access the corporate wiki from that device. A really important thing that's not even on this slide is any discussion of networking. You know, we were talking about VPNs a little bit before, and that's kind of dominated the conversation. I think networking is interesting, and your network topology kind of falls out of trying to get these core things, but ultimately what we're going to be talking about today is access, users, and devices, and trying to harden those as much as possible. So first, you know, I said this before: users. We're all kind of familiar with that, right? You are the user, you've done an OAuth dance, you have some credentials, and so on and so forth. It's pretty much what you'd expect out of any normal application-level access question. A few things we will be talking about today are how to harden that. So, you know, how do we actually get strong user first identity? And then what are the properties of a strong credential? Because this is fundamental to the access question: having a strong credential results in stronger access decisions. And then next we're going to talk about devices. We've discussed stuff like, you know, is this a personal versus a corporate device? But once you start thinking in that direction, there's a whole different set of questions that you might be interested in asking. What is the patch level of that device is a really good example. And then how do you store this information? What kind of inventory do you have? And then once you have all of this, in the same way that we're going to think about user credentials, how do we think about device credentials?
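The two-part access question described above, who the user is and what device the request comes from, can be sketched in a few lines. This is a hedged illustration only: the `User` and `Device` types, the `APP_GROUPS` policy table, and the `can_access` function are hypothetical names, not part of BeyondCorp or any product. Note that network location does not appear anywhere in the decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    groups: frozenset  # e.g. directory groups such as "employees"


@dataclass(frozen=True)
class Device:
    device_id: str
    corporate_owned: bool  # looked up in a device inventory


# Hypothetical policy: which groups may reach which application.
APP_GROUPS = {"wiki": {"employees"}}


def can_access(user, device, app):
    """Allow access only if both the user and the device qualify."""
    allowed_groups = APP_GROUPS.get(app, set())  # unknown apps allow nobody
    user_ok = bool(user.groups & allowed_groups)
    device_ok = device.corporate_owned
    return user_ok and device_ok


alice = User("alice", frozenset({"employees"}))
print(can_access(alice, Device("laptop-1", corporate_owned=True), "wiki"))
print(can_access(alice, Device("home-pc", corporate_owned=False), "wiki"))
```

The same valid user credentials produce different answers depending on the device, which is the personal-versus-corporate example from the talk.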
How do we harden the identity of those device credentials, and the storage of them, and so on and so forth? And then finally, access is the terrible nitty-gritty of, given these two pairs of credentials, how do we actually make an access decision? This looks totally different depending on what protocol you're actually using. As an example, if you're going to go visit the corporate wiki, that's going to look like a very different access decision, from a technical perspective, than if you're going to SSH to a prod box. But we really do still care about that combination no matter what protocol you're using. And then again, because we now magically have all of this device information, what kind of device information can be used to make an access decision? So if you're trying to get to BeyondCorp or a zero trust architecture, how can you actually go about doing it, and how do you know how far along you are? Eric and I put together a roadmap, or a maturity model, whatever you want to call it, for how you get to zero trust. As you go from left to right, your organization has more capabilities in terms of how it secures its users, devices, and access to applications on the network. Basically, the more that you harden, the more you're doing zero trust. Your first step on the road to BeyondCorp, level one, is having an inventory of your users and your devices. At level two, you have a way to measure most of your security controls and enforce some of them, and you're able to segment access to specific applications. At level three, we have what's typically called zero trust in the market today: it's tiering users and devices based on measurements and enforcing access based on these measurements. So, like Eric said, a device that's patched might be treated differently from a device that's not patched. And lastly, level four is what people think they're being sold. It's being able to have dynamically changing, risk-based access to applications. This is incredibly hard and something that's not
really available in the market today, because there are a lot of edge cases. It's what we should be aspiring to as an industry, but we're all still working on the details. So we're going to go through each of these, level by level. Level one, as I mentioned, is about having an inventory of your users and devices. To inventory the users who are accessing applications in your environment, the easiest thing to do is to use a single sign-on (SSO) identity provider. It's also a great, simple security improvement for your organization. This is typically tied to your HR information system, so when a new employee joins, changes teams, or leaves, their identity, and so their access to business applications, can easily be updated. Even if you have an SSO, you might not be using it consistently. Unfortunately, a lot of SaaS tools charge you extra for the ability to use SSO with their application, known as the SSO tax, or sometimes charge you extra just for the ability to enforce SSO for their application. For applications where you're not using SSO, you should ask employees to, say, use a password manager, but that's also not something that you can enforce. At this level you probably also treat employees' and contractors' identities the same way, although you might not give contractors access to all of the same applications. In terms of devices, at level one you have a device inventory and some way to manage devices. This can actually be pretty complicated, depending on whether you have a mix of both company-owned and employee-owned, or BYOD, devices. You probably have a wide mix of operating systems, especially if you support BYOD devices. I think one of my colleagues runs his own OS. You might also support mobile devices.
So that makes your life a little bit harder. Although your corporate policy might be to give employees, you know, company-owned MacBooks, if you let them BYOD their own mobile devices, you've already added a ton of complexity to what you need to manage. In terms of keeping track of company-owned devices, you ideally have a device inventory of who you bought what device for, though this is not necessarily more complex than a spreadsheet. And before you let a user connect to the network, you might have some manual verification that that's actually the expected device. And then at level one, in terms of access, it's that flat, traditional network where if you're on the network you're trusted, like a traditional VPN. You know, and ideally can control, what users and devices are on the network, given that you have these inventories, even if it's manual, but you're not segmenting users or devices any further. If you're only dealing with corporate devices and only self-hosted applications, then having a VPN plus an SSO is a pretty good initial set of tools to limit access to your internal applications. And now we're going to start talking about not just knowing and enumerating, but actually measuring and starting to ask questions of, I don't want to say our users, but, you know, our devices at the very least. So this is the slide where I just tell you to use security keys. There was a conversation this morning about passwords and all of the, you know, terribleness, and I think the passwordless phrase got thrown out or something like that. Security keys are effectively unphishable from a browser context. Yeah, you can go into the technical details of everything, but in terms of producing a good security control that ensures that the initial authentication of your users is strong, there is nothing that compares to just using WebAuthn-based solutions.
So please, I think that's the whole slide, right? Just use security keys. Oh wait, no. Oh yeah, and so then we also have things like user properties, right? This is more classic: you want some sort of grouping system, like Active Directory. You want a way to state the properties of a particular user. Are they, you know, an engineer? Are they an SRE? So on and so forth, because we're going to want to use that information for access decisions later. And then at level two for devices, we're going to start thinking about management of devices. Now, you can initially start talking about things like MDM solutions for mobile devices, but it doesn't actually need to be that complicated. I know plenty of people and plenty of organizations where it's like, yeah, we just have a script that we run and it checks in occasionally, or maybe they use something like Puppet. The ability to just ask your device what its patch level is, or, say, later on you might want to force it to update: having those basic capabilities is fundamental, because if you don't have them, you just can't advance past this particular point. And then, yeah, there's someone else on our panel today who is going to, I think, give an osquery sort of training. osquery is a great example of a tool that is open source, where you can install an agent on your devices and then start asking questions of your devices. You also want to start thinking less of, like, here's this key that I, you know, gave you, that's maybe shared between a bunch of things, that lets you on the VPN, and start thinking about per-device credentials that you can actually associate with the device. This is also where the spreadsheet isn't going to cut it, right? You do need at this point to think of your inventory not as something you only update when you onboard or offboard a device, but as something that is continuously living, because you're going to be profiling and gathering information from all of your devices. And then
level two of access management is, now we have to start thinking about not just having one single policy for all of our applications, but we might want to start doing this at a per-application level, and there are plenty of open source solutions that do something like this. You know, some nginx plugins go quite a long way. There are also things like the OAuth proxy. A layer 7 HTTP proxy is a really good way of achieving this, because you can't always make the assumption that all your apps are going to perfectly understand, you know, your format for your user credentials or your format for your device credentials. And then you also have to start thinking about what your strategies are for non-browser traffic. This again might be SSH or command line tools. While we generally think of BeyondCorp as a solution for your browser-based traffic, being able to SSH to a prod box is also something you definitely want to have a strong authorization decision for, and you can't just ignore these sorts of access requests because they're not, you know, your classic HTTP browser traffic. And then this is where we are also going to start consuming device credentials. mTLS is a really good place to start here, right? Give your devices a client certificate. There are a lot of solutions for, like, Android phones or that kind of thing that allow you, through an enterprise enrollment, to install client certificates that you use for your corporation. And now we're going to finally get to level three, which is sort of what we see as, like, zero trust today, or whatever. So a very, very common pattern is you have a user, and they have a security key, and they have a really strong initial authentication, and then they exchange it for some token that is valid till the end of time, that just goes on disk or gets thrown at some third-party service, right? So IMAP is a great example, right?
You have this strong initial authentication, and then this device just has a credential that can read the email forever. So the really important part of user authentication is not just to think about the initial authentication, but to think about all of the other credentials that a user might have, right? Your SSH keys, your access tokens, so on and so forth, and to think about time-bounding those. If you have some initial login and you give somebody a cookie that lasts for 24 hours, that's great, but what other credentials have they also been able to grab from your institution? API access keys are another great example where these things just live forever. Bearer tokens are problematic because you can't really hide the secret key, right? There's no secret key: you throw it at someone, you check it into GitHub, and it, you know, gets on the open internet. What we found very useful is to take weak credentials and bind them to strong ones. So for example, if you have a device credential, when you issue someone an API token you might put the hash of that device credential in that API token. And now when you're making access decisions, you can sort of force a binding between a weak credential and a strong one. And speaking of strong device credentials: first off, we're going to talk about active management. This is where we're not just measuring; we actually want to start enforcing things. Force patching is a great example. Force reboots are another great example, where you might patch, but until you reboot, nothing has actually been applied. This is also where you can start thinking about what the other security-specific configurations on your devices are, right? And do you want to force these clients to apply them? Device-based firewalls are another great example. This is also where we can start thinking more ambitiously about how we are provisioning credentials to devices.
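The idea above of binding a weak credential to a strong one, by putting the hash of the device credential inside the API token, can be sketched with the standard library. This is a minimal illustration under stated assumptions: the token format, the `SERVER_KEY` value, and the function names are hypothetical, the device certificate is treated as opaque bytes, and in a real deployment the presented certificate would come from the mTLS handshake rather than a function argument.

```python
import base64
import hashlib
import hmac
import json

SERVER_KEY = b"demo-signing-key"  # hypothetical; a real service would use a managed secret


def fingerprint(device_cert: bytes) -> str:
    """Hash of the device credential that gets embedded in the token."""
    return hashlib.sha256(device_cert).hexdigest()


def issue_token(user: str, device_cert: bytes) -> str:
    """Issue a signed API token that names the device it was issued to."""
    payload = json.dumps({"sub": user, "dev": fingerprint(device_cert)}).encode()
    sig = hmac.new(SERVER_KEY, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode()
            + "." + base64.urlsafe_b64encode(sig).decode())


def verify_token(token: str, presented_cert: bytes) -> bool:
    """Accept the token only when it arrives with the device it is bound to."""
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:  # wrong shape or invalid base64
        return False
    expected = hmac.new(SERVER_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        return False
    claims = json.loads(payload)
    return hmac.compare_digest(claims["dev"], fingerprint(presented_cert))


token = issue_token("alice", b"device-cert-A")
print(verify_token(token, b"device-cert-A"))  # same device presents the token
print(verify_token(token, b"device-cert-B"))  # token copied to another device
```

With this binding, a token that leaks into a repository is not enough on its own; the caller also has to prove possession of the device credential it was issued against.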
So basically all modern devices, right, my laptop, my phone, so on and so forth, have some sort of hardware identity that, you know, magic pixie dust, yeah, doesn't change. So this device has a TPM in it. My phone is an Android phone, so it has a StrongBox as a hardware-based identity for the device. iPhones have the Secure Enclave. These are great because you can generally challenge them for the identity and then issue corporate credentials based off of them. And that's great because you're no longer physically and manually going and doing this; you might actually be able to automate some of it. Also, all of these hardware-based identity mechanisms generally have the ability to store private keys. So now you're taking your mTLS certificate that just had a private key on disk, and you are able to back it by one of these hardware-based solutions, and now you have real strong confidence that when that device comes in to my network, it is that device and the request is coming from that device. And finally, we're going to talk about access, and there's a general term that Google has published called tiered access. The sort of summary I want to give is this: we've talked about personal versus corporate, but beyond that we're now going to start talking about device tiers, right? A personal device versus a corporate device is interesting, but what about the patch level of that corporate device? What about other security-relevant configuration? If it hasn't, you know, taken up these particular things we've prescribed to it, do we want to continue to allow it access? So when we talk about tiered access, we generally talk about devices losing trust. This is a device that is a corporate device that should have access but doesn't, because it hasn't done something we want of it.
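Tiered access as described above can be sketched as a function from measured device state to a tier, compared against the minimum tier each application requires. Everything here is an illustrative assumption rather than Google's published scheme: the tier names, the 30-day patch threshold, the `DeviceState` fields, and the application names are hypothetical. The lowest tier still reaches a remediation tool, so a device that loses trust has a way to earn it back.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    REMEDIATION = 0  # lost trust; may only reach what it needs to recover
    BASIC = 1
    TRUSTED = 2


@dataclass
class DeviceState:
    corporate_owned: bool
    days_since_patch: int
    disk_encrypted: bool


def device_tier(d, max_patch_age_days=30):
    """Map measured device state to a tier (thresholds are illustrative)."""
    if d.days_since_patch > max_patch_age_days:
        return Tier.REMEDIATION
    if d.corporate_owned and d.disk_encrypted:
        return Tier.TRUSTED
    return Tier.BASIC


# Hypothetical per-application minimum tiers.
APP_MIN_TIER = {
    "patch-portal": Tier.REMEDIATION,
    "lunch-menu": Tier.BASIC,
    "source-code": Tier.TRUSTED,
}


def allowed(d, app):
    """Unknown applications default to the strictest tier."""
    return device_tier(d) >= APP_MIN_TIER.get(app, Tier.TRUSTED)


stale = DeviceState(corporate_owned=True, days_since_patch=90, disk_encrypted=True)
print(device_tier(stale).name)
print(allowed(stale, "source-code"), allowed(stale, "patch-portal"))
```

The stale corporate laptop is blocked from sensitive applications but can still open the patch portal, which avoids the dead end where a user loses access with no path to regain it.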
For example, it hasn't patched recently. And critically, the most important tier is the tier that your devices get placed in when they lose access, because if they just suddenly lose access and your user has no way of actually regaining it, that's an extremely bad user experience and you're going to get some angry messages. So, level four. What we're going to describe at level four is what people think they're getting, right? They think that implementing a zero trust architecture means that you have dynamic, risk-based access to all of your applications, for all of your users, on all of your devices. And that's not real. That's not possible today. There's a long tail of things that are hard to get right, or that are not yet solved, for a normal company to have a zero trust architecture. Specifically, there are a couple of things in the long tail that are hard: SaaS applications, truly risk-based access, device state, and just random network devices. So let's dig in. This is a friend of ours, Matthew Garrett, who left Google to go, you know, work at a startup and realized, oh my god, all of the services my customers or my clients use are now hosted web services, and started to think, how do I do zero trust there? And the reality is that this is really hard when you can't put a service behind the proxy. And I think, based off the second tweet, Matthew's solution, which I think is a good, you know, if unsatisfying one, is that oftentimes your SSO, that, you know, corporate SSO that you pay extra to get, now becomes the access point. So when you go and log in through this and get bounced to your SSO provider that's run by your corporation, it now needs to make intelligent decisions about when I issue you that, you know, SAML assertion or the OIDC token. You know, does this device have a client certificate? Is it correctly patched?
And do I want to now allow this to access the third-party SaaS app for some, you know, given amount of time? Yeah, and one workaround that I've heard, and I really don't like it, but it's a workaround, is to have the SaaS apps peer to your network, so that ingress traffic to the app is only coming from allow-listed IP addresses that are presumably your corporate network. So an employee who's trying to access, you know, Workday or Zenefits or whatever it happens to be needs to pass their traffic through your corporate network and out the other end to that third-party application. Some security teams like this because they want to inspect the traffic or enforce certain things on that traffic. But it is kind of shitty for the user, right? It's much slower. Anyways, in addition to being slow, there are a couple of security downsides that are not great. This is completely a step backwards in time, right? You're going back to having a flat network, and you're saying that anyone who's on the network can access these SaaS applications, which is not what we're trying to do. It requires the SaaS app to have this functionality; not every SaaS application lets you have IP allow listing to a set of IPs. You probably also want a way to correlate audit logs from the SaaS app to your own audit logs, to know who's actually accessing the application and what they're doing. And you also don't necessarily have fixed IPs, right? Your network changes, you migrate data centers, or more likely you're using a cloud provider, and then you're allow listing, like, all of Google's public IPs or all of Amazon's public IPs, and that's not really closed to anything anymore; you've allow-listed a lot of the internet to talk to your internal applications. So relying on IP trust isn't the best way to deal with SaaS apps either. And I think the temptation with tiers is to have a lot of them. People want to do this; they want to say, you know, more than just you have, you
know, privileged access, plus some middle ground, plus no access. And this slide is from one of my friends who used to work at Mozilla, who was working on some of the ways in which they allow individual applications to prescribe what the requirements are for their own service. There are two main things that you hit. One is that every service now has to be aware of what tier level it accepts. This gets really messy when you think about things that also might have different tier levels internally. A bug system is a great example, where some of these might be extra restricted. Additionally, the more you measure, particularly from operating systems, the more you realize that everything is broken and held together with duct tape. You know, the version number changed and suddenly all these people lost their access. These types of things happen regularly, because ultimately osquery is cool, but the operating system vendors should be providing us that information in a way that isn't going to break because somebody tweaked, you know, a string. And then I put this in basically all of my slides as well: this is the NSA taking over a Cisco router and doing god knows what to it. State is great until your device starts lying to you, right? Hey, I'm patched and I'm cool and you can let me in. The reality is that there are device state attestation mechanisms. We talked about stuff like TPMs. They work really well for closed ecosystems, right?
iPhones are pretty good at detecting if they're broken or not. When you get into more open ecosystems like Microsoft, where they have to support all of these random devices, I've looked at TPM event logs and I have no idea: is this thing hacked or is it not? Because to know that, you'd have to work with one or two or three vendors just to get the right hashes, as an example.
And the last challenge we're going to talk about is random network devices. Some devices (thinking of you, printers) are just a pain to deal with and manage on your network in the same way. Again, I'm still not talking about production networks; I'm still talking about the corporate network. It has a lot of devices. Printers don't have secure boot. Printers can't be managed by MDMs. The easiest way for you to set up your printer is with a fixed IP address, and then you scan your LAN to find your printer and use it. But it's still a device on the network, and I still want to be able to verify and manage access to it in the same way, but I can't. This comes up more in brownfield than in greenfield deployments, because maybe if you're at a startup and you're trying to adopt a zero trust model, you can just not have printers. But existing legacy networks will have a harder time dealing with these random other devices.
So then, waving a magic wand and ignoring those challenges we just talked about: if you could have level 4, what would that be?
There would be a decent way to manage users and user access to SaaS applications, and to correlate logs for when users access those applications. There are probably only two okay partial solutions today. One is to host everything yourself, which is not super realistic, but you do see it becoming more and more common with larger infrastructure tools that offer on-prem solutions. The second one, and I'll add on to Matthew's list, is you can be Google or you can be Microsoft and use your single sign-on to access everything. Fundamentally, as an industry we don't have a way to deal with this, and we need a better way to deal with portable credentials for third-party SaaS apps, ideally also including device credentials, as Eric has pointed out. In terms of devices, you would need to include all the devices that are on your corporate network, because they're all a point of entry to your network. Usually the easiest way to address this is actually to move these devices off of the corporate network, rather than do whatever they need to do to meet the requirements to remain on the corporate network. And as Eric explained, having hardware-based device attestation from a root of trust that you can actually trust is still a very hard problem. And lastly, with all that info, making real-time decisions that aren't just rule-based but that change based on what you know about the user and device, quickly enough that the user doesn't notice or get frustrated. And so that's it, right?
But as you've learned, that's not exactly easy.
So to recap what we talked about today: we proposed different capabilities on your roadmap to adopting a zero trust architecture that looks a little bit like Google's BeyondCorp. At level one, you are treating the VPN as an access control point for the applications on your network. You can enumerate users and devices: you're using an SSO to inventory users and you have a manual way to list devices. These are table-stakes IT and security capabilities. At level two, you have per-service authorization, for example by using a proxy. You can measure most and enforce some security controls. You have a VPN for your network, an MDM to track and manage your devices, and your users use SSO and MFA, ideally security keys. This is where most security-focused enterprises are today. At level three, you're all in on what the market calls zero trust architecture, and you've maybe even bought a solution that bills itself that way. What you can do is enforce tiered access to applications based on device characteristics. And level four is what many people will aspire to but no one has yet fully achieved. You want to be able to dynamically enforce risk-based access to applications, and there's a long list of user, device, and access issues, such as SaaS apps, that are very hard to get right today.
I think the wrong takeaway from this talk is to look at this list and say, "I'm going to do a check mark and get to level X," right? BeyondCorp is never really done. It's more the friends you make along the way, right? Like, we're friends. You want to be focusing on these core pillars, and that's not a VPN. It's users, devices, and the way you combine those two pieces of information to determine their access. That in and of itself brings a huge wealth of security benefits, regardless of what level you're at or whether you ever claim that you've accomplished BeyondCorp. And that's it.
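The level-three idea from the recap, tiered access decided by combining the user with device characteristics, can be sketched roughly as follows. This is only an illustration under stated assumptions: the tier names, the device fields, and the service names are all hypothetical and were not given in the talk, and a real deployment would evaluate this in a proxy with far more signals.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    """Hypothetical access tiers, ordered from least to most trusted."""
    NONE = 0
    BASIC = 1
    PRIVILEGED = 2


@dataclass(frozen=True)
class Device:
    """Illustrative device facts; a real inventory would report many more."""
    managed_by_mdm: bool
    disk_encrypted: bool
    os_patched: bool


def device_tier(device: Device) -> Tier:
    """Map device characteristics to a tier (assumed rules, not from the talk)."""
    if not device.managed_by_mdm:
        return Tier.NONE
    if device.disk_encrypted and device.os_patched:
        return Tier.PRIVILEGED
    return Tier.BASIC


# Each service declares the minimum tier it accepts (service names are made up).
REQUIRED_TIER = {"wiki": Tier.BASIC, "source-code": Tier.PRIVILEGED}


def can_access(user_authenticated: bool, device: Device, service: str) -> bool:
    """Combine the user check and the device tier into one access decision."""
    required = REQUIRED_TIER.get(service)
    if required is None or not user_authenticated:
        return False  # unknown service or unauthenticated user: deny
    return device_tier(device) >= required


if __name__ == "__main__":
    laptop = Device(managed_by_mdm=True, disk_encrypted=False, os_patched=True)
    print(can_access(True, laptop, "wiki"))         # BASIC tier is enough here
    print(can_access(True, laptop, "source-code"))  # needs PRIVILEGED
```

The sketch also shows the cost mentioned earlier in the talk: every service has to know which tier it accepts, and every device signal it depends on has to stay reliable.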
Thanks so much for joining us, and drop questions in the Slido.
Thanks. So those were three great talks. In about 10 minutes we're going to set up the Q&A. You can still ask questions on Slido, and we have many questions. It's going to be a really interesting discussion. Stay tuned.
So welcome to the Q&A. We have 30 minutes of questions, and we have a lot of them. So I won't go into too much of an introduction, but I'll still introduce Guillaume, who was doing the workshop on Fleet and osquery. Guillaume is the head of security at Fleet Device Management, the company behind the open source Fleet management platform for managing and using osquery. While he prefers working in startups, he's been working in security forever in organizations of all types, and prefers working on the bright side of things and on things that work, instead of repeating 30-year-old best practices that never have.
All right, so the first question, the most upvoted, is I think mostly for Caspian: what is the impact on mental health of cumulating incident response, and what can be done to better handle it?
That's a great question.
So it kind of depends on what level of incident response people are working at and how often they're dealing with it. But in a lot of cases your first incident is going to be stressful. If you're dealing with a lot of incidents, it's going to be stressful. And sometimes you're going to see things that you absolutely do not want to see, hear things you don't want to hear, and experience things that you won't want to be there for. So I think the mental health impacts can be kind of high, and it's important both for employers and for people running security teams and managers to understand that the folks they're putting on the line to do this actually do experience that, and to have that awareness and have that support in place when it's possible.
All right, thanks. Any tips for capturing momentum after an incident to get better practices in place?
So, post-incident... Oh my God, are these all for me? I feel really special right now. Yeah, a couple. One: post-incident, basically take what you've learned from the incident. It's going to make a huge difference if you can say, okay, let's not waste this incident; let's learn something and let's build from there. And in terms of capturing momentum from the teams, it also kind of goes back to a mental health thing, right? You basically got knocked down; you have to get up again. Oh wait, that wasn't... yeah, I was going to go on with this song, but... It's not a great situation, and a lot of times you can use the incident as a rallying point for people. But I think the most important point is to not point fingers, to not turn this into a situation where it's like, "Yeah, so-and-so is responsible for this, so they should get fired or demoted or whatever."
That's not something you should ever do in these situations.
Yeah, I think the blue teams in general have a reputation for pointing fingers. Maybe we shouldn't be doing it, or maybe we don't do it, but I think people external to security see security as pointing fingers, and it's bad for the business or the industry as a whole.
Yeah. So I've worked with Guillaume in the past, in a really fun role at a former employer, and I've heard a lot of his stories of kind of losing it and getting angry but being positive, and I think that's actually kind of important in these situations. Be angry at the incident, not at the individual.
I think, to add on to some of what you were saying about not blaming a particular person: Google has a culture among SREs of blameless postmortems after an incident happens, and I think as much as possible that also applies to security. Something bad happens; document what you should have done instead. There's something critical that you absolutely have to go fix, and it's not someone's fault that, I don't know, the alert system failed or whatever, right?
There's nothing that'll get a proof of concept built faster than some terrible all-hands-on-deck security something happening, right? I wouldn't point to specific things, but over the past few years you could imagine those moments where it's like, everybody stop what you're doing.
We're working on this one thing. Those are often the times where you end up getting to proof-of-concept the things that you wanted to all along, and suddenly it's running on every machine and you're like, okay, maybe we could do more of that. And that's a much easier place to be when you've kind of proven that out, out of panic more than anything.
And then going from the proof of concept to actually implementing it correctly is another topic, but it's really complicated as well.
I have a question for either Eric or Maya. You talked about security keys and WebAuthn. It's a fairly new technology, but the technology is there. It works. I think the adoption is quite slow, and I think a lot of people, even security people, are not aware of the benefits, the biggest one being that it's basically unphishable. So how do we speed up this process of getting people to know it exists, it's very good, and you should get to that at some point?
I would say that I have a much easier job because I work in a corporate environment where you can sort of just say, "Hey, we're going to use these." We start enumerating all the places where we don't use them; we have exception processes for that. But if you look at what the FIDO Alliance is trying to do, they're trying to get the whole world onto that, and you've seen things like their white papers on trying to have even soft FIDO tokens that you can move around between devices and back up to your cloud, right? So I think these are kind of two different topics, right? A corporate entity is going to have all of its weird little things where you're going to have to know that, okay, this doesn't use WebAuthn or a security key. Getting the entire world to use security keys is something I'm dramatically underprepared to speak about.
Yeah, I'll add a couple of things.
One is that your SSO probably has a control that says something like "require security keys." So you just prep your employees to be ready for that, send them all security keys, and explain that you can put multiple credentials on the same key, that they can have lots of keys, that they all work, all that kind of stuff. You need to spend the money up front on those keys and flip that switch, but that's absolutely something an enterprise can do. The other thing I will say is I think some of it's going to be forced on us, in a good way. GitHub announced a couple of months ago that they're going to force all developers to have 2FA. Not security keys, just 2FA; baby steps. I think in 2024 or 2023. I got an email from them. It's time. It's really, really time; I'm shocked this hasn't happened before. But a lot more people are going to just start using second factors and security keys, and adoption will continue.
And about GitHub, I think they said that 15 percent of users have 2FA, which I thought was really low, because they're developers. They're closer to security than most people, and they should be aware of the benefits of it.
I don't know about the 15 percent, but the last public number I saw was that 11 percent of projects that have over 100 developers have developers who use security keys. Or sorry, use 2FA, because I don't think GitHub tracks which one is which, at least in what they publish. Yeah.
All right, question for Yuri. Can you successfully distinguish obfuscated JavaScript from minified JavaScript? Because I think minified JavaScript is pretty common on the web as a whole, because it reduces payload size and there are lots of benefits. So how does your model deal with that?
Yeah, it's a great question.
So let me first explain in a few words, for those who are not sure what the differences are. Minification means that you do not intend to hide the code, but rather to compact it. Usually the purpose of minification is to make the code run faster, and it mostly just changes the names of variables and compresses the code onto a single line. One of the examples I showed was indeed minified. In our case, first of all, we didn't see an interesting reason to distinguish minification on its own. From our point of view, minified code is just clear text. I did try to create a model that does multiclass classification with three classes, minified, clear text, and obfuscated, and I didn't see that this model performs better. But again, the point was ultimately to distinguish between obfuscated and clear text, so in this case it didn't work better than that. But in general, to sum up, it's possible to spot minification with machine learning, very similarly to what I described before in my presentation. If you really want to, just use a minifier, create the data set as you like, and run a similar model, and I guess it will work.
Thanks. A question for, I guess, anybody who works in incident response or on a blue team: given an incident, have you ever considered letting the threat actor operate in the hopes of capturing more tools and tradecraft? Is there someone who has a story to share about that? Can you talk about the story?
I don't have a lot of stories about that that I can talk about, but that is definitely a thing people do, yes. I mean, I currently work at a place that mostly handles ransomware incidents. So tools and tradecraft: we have a really big intel team for that.
We've got... you know, that's basically what we do. But yes, I have seen that done. The game that you're playing with that is a dangerous one, because you are basically saying, let's let this roll out further, and you're doing that hopefully with legal oversight, because there are a lot of questions that are going to be asked after the fact if you basically let any threat actor wander around in your network and start messing with things.
Yeah, and it's a big gamble, because I guess one of the good reasons you would want to do that is if you suspect the threat actor has many footholds and you want to identify them before letting the threat actor know that you're onto them. So it's a really tricky question, and I don't think you can have a general answer. It's really going to depend, incident by incident.
No, I mean, really this comes down to a visibility piece again. This is one of the things that I think a lot of blue teams struggle with: being able to see as much as they can on the network. And there are also the controls, which I know you guys were talking about, because I got to see that. So there are elements of this where we're only just now getting to the point where we have the technology to maybe see this as well as we want to, but there's a lot of resistance to putting that in place.
All right, a question for Guillaume. At what size of company does something like osquery and Fleet, I guess, start to make sense?
I would say if you're remote and you have computers that are owned by the company, as opposed to BYOD, at 30 or 50 people it probably starts to make sense. I think as soon as you start having enough systems that you're wondering what's happening on them, right?
So if you're a dentist's office and you've got three computers, it probably doesn't make sense. But you don't need to have a thousand. Definitely under a hundred makes sense.
And would there be an upper bound? If we have 60,000 machines, is it too much? Should we go with another solution?
No. I mean, the thing with osquery is there are something like 300 different tables with data that you can pull from. Some of them can generate a lot of data. For example, there's a table that will tell you about every single connection that's established. If you run that on a Linux server that's a load balancer, you're on the internet, and you get a billion connections, it's probably not the right tool. But I've seen a lot of deployments with 100,000, 200,000, 300,000 servers, and it scales really well. It's more about where you put the data afterwards. But it lets you be very precise about what you're collecting. You don't have to grab everything and then decide what you're going to do with the data later, right? You can decide: okay, we have a new use case, we need this table, we're going to collect that every 15 minutes, every hour, every week, and then start using it. So it's not the same type of "let's grab everything and figure out what we want to do with it later," which can be very expensive. I guess whoever sells a license based on how much you can index, or sells hard drives, loves that model. But we don't think that's the right model in most cases, especially if it's a big environment.
All right, thanks. Maya or Eric, I guess: what level of logging do you think should be in place for each level of zero trust?
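As an aside on the osquery point above, picking specific tables and collecting each on its own interval could look roughly like the sketch below. It builds a configuration in the layout osquery uses for scheduled queries (a `schedule` map whose entries have a `query` and an `interval` in seconds). The query names and the particular tables and intervals chosen are illustrative assumptions, not something recommended in the session.

```python
import json

MINUTE = 60
HOUR = 60 * MINUTE
WEEK = 7 * 24 * HOUR

# Hypothetical query names; "schedule", "query" and "interval" (seconds)
# follow the layout of an osquery configuration file.
schedule = {
    "listening_ports_15m": {
        "query": "SELECT pid, port, protocol, address FROM listening_ports;",
        "interval": 15 * MINUTE,
    },
    "logged_in_users_hourly": {
        "query": "SELECT user, host, time FROM logged_in_users;",
        "interval": HOUR,
    },
    "os_version_weekly": {
        "query": "SELECT name, version FROM os_version;",
        "interval": WEEK,
    },
}

# Serialize to JSON text that could be saved as a config file.
config_text = json.dumps({"schedule": schedule}, indent=2)
print(config_text)
```

Each new use case then becomes one more entry in the schedule, rather than a decision to ship everything and sort it out later.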
So I guess this references risk-based access.
I mean, I don't know exactly where, but I think a lot of what we were trying to explain in the talk is that this is so complicated. And logging, when you talk about logging, is not one thing, right? There's device logging, endpoint logging, and you have stuff like osquery, which is kind of logging but kind of not.
Visibility is hard.
Yeah. So I don't know if I have a great answer for what you should log. Experience-wise, there's also this kind of weird interaction where you need different kinds of logs now, right? Because you're not just going to have all of your things on the corporate infrastructure, where you can say, "Oh yeah, here are all the IPs that came in and out." So I would say that logging is just another one of these wonderful challenges that you'll experience. It's not really a good answer.
I mean, log everything, I think, is the answer.
Is there an incident response where you had too many logs?
Yes, absolutely. And the thing is, I will always say log everything. We have too many logs sometimes, and that takes us a little bit of time. There are really good data engineering tools; there are really good databases for this stuff. So if you've got the right tooling in place, it's not a problem. I have been in places where... you know, I keep making this joke about grep. I've done it. It sucks.
And it takes a long time. Too many logs operationally is a problem; too many logs in terms of an investigation never is. There's no such thing as too much evidence.
Grep is the best tool ever.
I'd also say that logging traffic on the network is maybe a little less interesting than "I have an application proxy, and it knows exactly who is connecting and what device they're coming from." So I don't think it's a matter of more logging or less logging. Sometimes you just get better-quality logs out of these kinds of solutions too.
I will add on to that. I feel like a lot of people have built these proxies or load balancer proxies, whatever they happen to be, and they're like, "Wow, now that all my traffic goes through here, I'm going to do all kinds of other stuff to it too," and you're like, that wasn't the point. You can solve that problem in other ways. It's harder, but you solved the problem that way because you already had somewhere to put the logging. What you really need at the end of the day is going to be some sort of audit log of "this user accessed this application at this time and did these things." There are lots of ways to do that. It happens to be easy with a proxy, but there are lots of ways to do that.
For Yuri: have you found that a lot of non-malicious JavaScript is obfuscated for intellectual property reasons? Do you have numbers or something? You said it's a valid use case, but how widespread is it? Do you know?
You mean the intellectual property thing? So I think in one of the papers, from WWW '19, they really did a deep analysis of the distribution of JavaScript in terms of what happens there. For our data at Imperva, as I mentioned, it wasn't really feasible to tag it manually, so we used some heuristics just to test the model. But it's not something that we can accomplish in any way. Just as an example, with almost half of 100,000 scripts as a test set, I have no way to ask security researchers (I mean, we have many) to tag them and say which ones are, by chance, the intellectual property type of obfuscation.
Yeah, you would have to reverse them and then steal their intellectual property or something like that.
About machine learning in security, we have a comment here: it seems that everyone wants to use machine learning for stuff like IDS and system security. Is it worth it, or can it just be a waste, instead of boring things like good logs? So I'm going to rephrase it. Where do you think machine learning fits into security as a whole, as an industry, as a part of doing business?
Yeah, it's a good question. Generally, if you ask a data scientist whether machine learning is good for something, he will most probably tell you that yes, it's very good. I think there are many applications for machine learning today, as the algorithms become more sophisticated and there are more libraries you can just download and easily use, and you don't really need to understand the inner workings of the models. But indeed there are cases where you cannot use machine learning. Let's maybe talk about this negative example: when can't we use it? There are some cases where we don't have enough data, and there are cases where the data is too dirty.
It always starts with the data. When we look at the data, we can see if it has to be cleaned, if we have enough of it, if we have a very unbalanced data set. These are maybe the corner cases where, if you just take an out-of-the-box algorithm and apply it to such data, you may come up with something strange. So you cannot really replace this pipeline of gathering the data, and a good amount of it. If you want to solve something like anomaly detection and you have 10 examples, then you don't have to use machine learning. So that's, I think, my take on this as a rule of thumb.
I'll add on to that, because we talked a little bit about risk-based access, and I think people are like, "Oh yeah, you can just decide whether or not someone can access this application and just sprinkle some ML on top." It's exactly what you're saying. Anomaly detection is a great example that's actually kind of bad for machine learning, right? Machine learning is good at a couple of things. One of the things it's really good at is sorting things into two categories, A and B. That works well if A and B both have lots of data. If you only have 10 examples of intrusion attempts on your system, you just can't train a model on that, right?
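The arithmetic behind the "10 examples" point can be made concrete with a small sketch. The numbers are invented for illustration (100,000 access events, 10 of them malicious), and the "model" is deliberately a stand-in that has learned nothing. It shows why a very unbalanced data set is misleading: accuracy looks excellent while every intrusion attempt is missed.

```python
def evaluate(labels, predictions):
    """Return (accuracy, recall), where label 1 marks an intrusion attempt."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    true_positives = sum(
        1 for y, p in zip(labels, predictions) if y == 1 and p == 1
    )
    positives = sum(labels)
    accuracy = correct / len(labels)
    recall = true_positives / positives if positives else 0.0
    return accuracy, recall


# Made-up numbers: 100,000 access events, only 10 of them malicious.
labels = [1] * 10 + [0] * 99_990

# A "model" that has learned nothing and always answers "benign".
always_benign = [0] * len(labels)

accuracy, recall = evaluate(labels, always_benign)
print(f"accuracy={accuracy:.4%} recall={recall:.0%}")
```

With these made-up numbers the always-benign baseline is right on 99,990 of 100,000 events yet catches none of the 10 attacks, so any trained model would have to be judged on recall (and false positives) rather than accuracy, and 10 positives leave very little to learn from or to validate against.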
It's not possible. And so with something like risk-based access, you don't want to put that in front of a user who's going to get even more upset if they can't access their email or do their job or whatever it happens to be. Sometimes you can do some of that stuff after the fact, to look at logs. But even then you don't have enough data to train a model if there's just nothing to go on.
Yeah, I think this thinking comes from the fact that a lot of security people have very little experience in machine learning, and they go either to the side of "ML can't do anything" or "ML can solve everything." We need to talk more to data scientists about this stuff, so we can know that ML can do this but it can't do that.
And just trying to maintain, even getting your signals to fire in a big system... I talked about the quality of the stuff you get from an operating system. I do a lot of development where you have nice hermetic tests and you can say, does this run if I run it against another system? With a big corporate infrastructure, you're not really going to have those very often, and signals rot pretty quickly. So even just getting signals to fire, before you can even take that structured data and do machine learning on it, is already a hard enough problem, right?
But I still think that we don't need to discourage people from doing machine learning. I think it's the contrary.
So I encourage everyone to download some libraries, to become proficient with pandas and scikit-learn and stuff like that, and you can see many, many examples where you can do really cool things pretty easily. I think that will help you understand what machine learning is good at and what it's not, and it's much better than just using buzzwords or, as Maya said, just sprinkling them on top.
That's great advice, even though it's spooky. I don't want to pip install scikit-learn.
You just have to wait a little bit, right, for everything. Yeah, it's kind of a big package.
All right, probably the last question, maybe another one after that. I need to ask, because two weeks ago I saw on Twitter interview questions for a SOC analyst role, and the interview questions were like: you see Windows event log event ID 4624; what do you do? What does this indicate? Or you see Windows error code 11701; what is it? And I thought this was really bad.
You didn't know what these things mean? Oh, come on.
Yeah, I've been working two years in a SOC, and I would Google these questions. Actually, if someone asked me this in an interview for this role, I would probably walk away from the interview.
So I have actually asked that in interviews, but it's usually "what's your favorite event code?" Not "you see this, and then what do you do?", because it's not fun to put people on the spot.
And I think I didn't mention it was for a junior analyst role. So my question would be: for blue team roles, analyst roles, or IR roles, stuff that requires analysis and deep technical knowledge sometimes, what would be your good interview questions?
I love that you're all looking at me now.
What's your favorite event code?
4624, really, honestly.
I'm going to stall and give you all some time. A couple of things that are bad with that interview question: it tests specific knowledge that someone has to already know, which is bullshit. It doesn't show that the employer is willing to train you on the job or help you learn anything, which again is bullshit. We literally went to trivia, I don't know, two weeks ago (this is EFF trivia, so take that with a grain of salt), and one of the categories was CLI commands for deprecated systems. I think our team got every question, but that was embarrassing, and I contributed absolutely nothing to that. Anyway, so I stalled; now you have some interview questions.
To me it's more about the process that someone's going to follow to get more information, right? So okay, what would you do if you saw that? Even if you told me, "Well, I'm going to Google the event codes," because obviously that's what someone's going to do, that's a good answer. But then maybe I'll tell you what the event code means; where are you going to find more information about that? And then I feel like even questions around networking can sometimes be useful, just to know if someone has a decent understanding of how networks work. It doesn't have to be like, "Oh, calculate the subnet of whatever, you've got three seconds." That doesn't really matter.
Here's an IP packet; what's the checksum?
Yeah, exactly. But it's more about just the basics, and then that you're curious and that your process for getting down to finding it is good. Troubleshooting is the same thing, right?
If you want to troubleshoot something, you need to eliminate theories until you get to the thing that's broken, and when you're investigating it's pretty similar. So to me, I'd rather hire someone with less experience who seems to know how to look for things and come up with theories and look into them than someone who can recite Windows event IDs by heart.
So yeah, I agree completely with this, because typically if I'm running the IR team or the SOC, I'm not going to be looking for somebody who is going to be able to calculate packets. This is what Google is for, right? Skill-testing questions are nice; process is more interesting. So you don't need to be able to answer things by heart, but understand how you're working.
I'd also say, as someone who does a lot of interviews for my employer: this sounds like advice that you'd be giving to somebody who's newer, but we've read your resume, we know about what level you're supposed to be, and we've assigned someone to you who knows a little bit about that domain. So they're there to have a conversation with you. I think the biggest advice for that is, if you put something on your resume, please be ready to talk about it, because weirdly, that's how we determine how we're going to have a conversation with you to figure out all these things. Okay, here's what you know, but are you interested in this? Are you clever? If we give you these two pieces of information and you know something about this domain, can you put that together?
And yeah, in one of my first interviews, for an internship when I was in university, someone asked me about a personal project I put on my resume, and I was like surprised Pikachu face. I was ready to talk about it, and I was glad, but I didn't expect someone to ask questions about all the aspects of my resume. But it makes sense once you're used to interviews.
So this is all the time that we have. Thanks, everyone, for the great community session, and we'll see you tomorrow.