Okay, our next talk is going to be by Hanno Böck, "In Search of Evidence-Based IT Security", and he wants to do most of the introduction himself, so this is a very short and brief moment for me up on stage. Enjoy, and give it up for Hanno.

Yeah, hello. So, as was said, I'm Hanno Böck. I'm working as a journalist and a hacker. I like to avoid the term "security researcher", and I hope during my talk it will become obvious why that's the case. I write articles, mostly for Golem, usually about IT security topics, and I run the Fuzzing Project, where I try to improve the security of free and open source software. This is funded by the Linux Foundation's Core Infrastructure Initiative. I also write a monthly newsletter about TLS.

So as I work in IT security, I occasionally go to security conferences. Not just conferences like this one, but also conferences where you have a vendor area where people are trying to sell IT security products. So I have a few pictures here. Here someone is selling "next-generation APT defense". Here someone is selling something with artificial intelligence. Someone is asking, "Everything is moving to the cloud. Why isn't your security?" And this vendor is saying, "the only vendor with guaranteed protection from ransomware".

And when I see these things, I am a bit skeptical. I feel many of the terms don't have a real meaning; they feel like marketing terms. I don't really know what these products are doing, or if I know what they're doing, it doesn't feel right. And I'm not the only person skeptical about IT security products.

So I don't know if you know this guy. This is Tavis Ormandy. He's working for Google, and lately he's been looking at security products, and what he found was that many security products are not very secure. So for example, he found out that Avast was using some open source code, and they replaced strncpy with strcpy and introduced a buffer overflow, but it's faster.
So, great. Trend Micro accidentally left a remote debugging server running. Palo Alto Networks had a memory corruption because they shipped a web server that was no longer supported. And here he was trying to contact AVG and said, "Also, your code makes zero sense."

Yeah, and we have headlines like this, where PC World says antivirus software could make your company more vulnerable, and on the upper right, "Antivirus tools are a useless box-ticking exercise, says Google security chap". And here are two tweets from a recent, quite heated debate about the value of antivirus software. Justin Schuh, he's a Chrome developer, compared antivirus to homeopathy, and April King, who is from Firefox, said antivirus causes a pile of security issues for Firefox.

And this is from a survey. Google asked users and IT security experts what they think are the most important things to do about IT security. And you can see the users had antivirus software as the very first thing. And the security experts don't seem to think that's so important. It doesn't even show up in the top five.

So we can conclude that there's considerable disagreement about whether IT security products, and especially antivirus software, are actually a good idea. So how do we actually know if these things work?

And to investigate that, I'd like to talk about something completely different. So this industry likes to use medical analogies. We're talking about viruses. Viruses are usually something from medicine which affects people. And here's another form of antivirus: it's a vitamin C pill. And here's a person having a common cold, sneezing. And yeah, many people think that if you have a common cold, you should take vitamin C pills. Unfortunately, it's probably not very useful. And why do we actually know that? Obviously we know this because we have science.
We're using science to investigate whether things work. And for the vitamin C, here are some quotes from a study. It says, okay, regular ingestion of vitamin C had no effect on common cold incidence in the ordinary population. So if you're an average adult person and take vitamin C regularly, you're just as likely to get a cold as everybody else. However, it may be that it shortens the duration of your cold a little bit. But if you take the vitamin C only once you've already got a cold, it has no use at all.

And the study here is from the Cochrane Collaboration, which is an organization that's doing so-called meta-analyses. I will come back later to what that is, but I think there's widespread agreement that the Cochrane Collaboration is creating some of the highest quality scientific evidence in medicine.

So if we want to know if a medicine, or also something like a food supplement like a vitamin C pill, works, what's usually done is a so-called randomized controlled trial. That is, we take a group of people that may have some illness, and then we split them randomly into groups. And it's crucial that this is done randomly, because we don't want to have some statistical effect where we chose one group that is maybe more sick than the other group to begin with, and then this screws with our results. So we need to split them randomly into groups. A simple way would be, okay, one group gets a medication, the other group gets a placebo, and then we see what happens.

In reality, it's usually more complicated, because usually we have a situation where we already have a known good medication and we have a new medication, and we just want to know if the new medication is better. So we compare an old and a new medication. And we also may have an alternative to medication, like exercise or dietary changes, and we may want to know:
Okay, maybe we have a medication that works, but doing exercise works even better, and maybe taking the medication and doing exercise at the same time works better still. But this is the general idea. So we randomly split people into groups and test what happens.

And then we usually don't really care about a single study, because there are far too many things that can go wrong. So what we usually care about is all the scientific evidence we have as a whole, and that's why we're doing a meta-analysis, which is: we're trying to search for all the studies that have been done on a particular topic, ideally randomized controlled trials, and we combine the results. This is obviously sometimes complicated, because we might have studies with different groups of the population, and they cannot always be easily compared, but that's the ideal. So we have many studies, and then we combine the results and look at the whole body of evidence.

Yeah, so we call that evidence-based medicine. So ideally, we want to make all decisions based on high quality scientific evidence, which a meta-analysis very often qualifies as.

Now I want to point out that one shouldn't have a too idealized view of science, because there are many problems. Here, the top one is actually the most popular open access paper of all time, which is titled "Why Most Published Research Findings Are False". It was published in 2005, and it's actually not very controversial. And the middle one points to an issue that has been debated a lot in recent years, where there was a big experiment to try to replicate studies in psychology, and they found out that they were only able to replicate the results of 37% of the studies. So for the majority of studies, it seems either they are wrong or the replication was wrong, but it seems there's a problem. But this is not only affecting psychology. You have the same problem in many fields of science.
For example, there's a very similar debate in cancer research. And the lowest one points out that many clinical trial findings never get published, which is also a very important thing to consider: the science we're seeing is not all the science that has happened. We very often have a situation where people do a study, and then based on the result they decide whether it's interesting and gets published, or whether it just gets thrown away.

So if you want to evaluate what's good or bad science, then there are some things we can look at. Something very obvious is if we have a very small number of research subjects. So sometimes you see studies where people say, okay, we've tested this with ten people. Then I say, okay, that's maybe not very meaningful. It could be just coincidence; it could be a statistical glitch.

Then there is something which is a very common problem, not only with the quality of the science itself but also with the reporting about science, like when the media reports about it: correlations are reported as if they were causal results. So what's happening here is, if we have a set of data, we may find out that all the people who have property A also have property B. Then we could conclude, okay, A causes B. But it could also be that B causes A. Or it could also be that there's a so-called confounder, which means we have something completely different, that we may not even know about, that's causing both A and B. And this is generally a problem in all studies where you're using an existing data set and trying to find something in it. And that's why we are doing these controlled trials where we are splitting people randomly into groups, so we can exclude that there's some other factor at work.

And then, yeah, sometimes we only have a single study or very few studies. We usually want good science to be based on many studies.
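As an illustration of the confounder problem just described, here is a minimal Python sketch. It is not from the talk, and the function name and all probabilities are made-up assumptions. A hidden factor Z makes both property A and property B more likely, while A itself has no effect on B at all. In the observational setting, people with A still show B far more often; when A is assigned by coin flip, as in a randomized controlled trial, the apparent effect should disappear.

```python
import random


def b_rate_difference(randomize, n=20000, seed=1):
    """Return P(B | A) - P(B | not A) in a simulated population.

    Hypothetical model: a hidden confounder Z raises the chance of both
    A and B. A never influences B directly.
    """
    rng = random.Random(seed)
    with_a, without_a = [], []
    for _ in range(n):
        z = rng.random() < 0.5  # hidden confounder, present in half the people
        if randomize:
            # Randomized trial: A is assigned by coin flip, independent of Z.
            a = rng.random() < 0.5
        else:
            # Observational data: people with Z are much more likely to have A.
            a = rng.random() < (0.8 if z else 0.2)
        # B depends only on Z, never on A.
        b = rng.random() < (0.7 if z else 0.1)
        (with_a if a else without_a).append(b)
    return sum(with_a) / len(with_a) - sum(without_a) / len(without_a)


if __name__ == "__main__":
    print("observational difference:", round(b_rate_difference(False), 3))
    print("randomized difference:   ", round(b_rate_difference(True), 3))
```

With these assumed probabilities, the observational difference is about 0.36 in theory (0.58 versus 0.22), even though A does nothing, while the randomized difference is about zero up to sampling noise.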
We want science to be replicated independently.

And then we have a thing that's called publication bias, and that's what I mentioned earlier: we don't see all the studies that are done. We may have a situation where a pharmaceutical company runs a trial on a medication, and it turns out the medication doesn't really help, and then they don't publish the trial. And then they do another trial, and there it seems like the medication helps, and then they publish it. And you can see that if you only see the positive studies and you don't see the negative studies, and then you try to combine these results, like in a meta-analysis, then you get a skewed result.

And another problem is called outcome switching, and it's kind of related to fishing for results. That is, you may have collected some data, but it doesn't match your theory. But then you could try: okay, maybe if I just use a subselection of my data, maybe I can prove something similar to my theory. And if you look long enough, you can take some random data and you will find something that looks like a scientific result. It's not generally a problem to do this, but you should be transparent about it. If you were first searching for something, and then later you're searching for something else, you should at least make clear that you did that.

And ideally, for all these studies that are somehow based on statistics, these empirical studies, you want them to be pre-registered, which would mean that you publish what you're about to do before you even start collecting the data. So you could say, yeah, I want to study this medication, I'll do a randomized controlled trial with these groups, and then I publish that in a trials register. Because then, if you later change what you were studying, other people can see that. So it's transparent. But we're very far from that. In medicine, this is happening.
Usually it's still not ideal, and there are still a lot of problems with this, but in many other fields this is not happening at all.

Okay, now let's get back to IT security. Here's an empty slide, and it's not a mistake. It's intentionally empty, because it's also the complete list of all randomized controlled trials that have ever been done on security software.

There are some people who are doing something that may look a bit like scientific tests of antivirus software, but I feel the methodology that's used there is extremely flawed. So what they usually do is, they have a collection of malware, which is hopefully somewhat representative of real malware, and then they try which software detects it and which does not. This has a lot of problems. For example, if a product detects the malware, that does not mean that without the detection the malware would have infected the user. It could be that the malware tried to use a browser exploit, and the exploit only works in an old version of the browser that the user is no longer using. And they usually completely fail to consider the idea that you could do something other than antivirus software to protect yourself, like regular updates and application whitelisting. So these tests kind of have the idea that antivirus is the only thing you can do, and the only thing that matters is comparing different products against each other. They usually don't consider that antivirus software itself could be a security risk. But most important of all, they usually don't test with real users. They are testing in some kind of lab condition where they say they are simulating what a real user is doing, but they are not testing with real users.

And it's quite widespread that you see questionable forms of statistics in IT security. One very notorious example is CVE counting. So CVEs are, I don't know if everybody knows that, identifiers for security vulnerabilities, and what some people tend to do is say, okay, Windows had
that many CVEs, Linux had that many CVEs, so clearly Windows is more secure than Linux. This is completely flawed, because these CVE identifiers don't even try to be complete. And if you don't believe me, there's a talk from the guy who invented these CVE IDs, where he says you shouldn't do these kinds of statistics.

So my feeling is that IT security is largely not based on scientific evidence. And this is something that bothers me a bit, because I work in IT security and I'm a very scientifically minded person. So when someone tells me, hey, this is healthy, then I say, do you have some studies to show me? And if you don't have the studies, then I don't believe it. And at the same time, I'm working in a field where, if I ask this question, the answer very often is just that the evidence is not there.

And now you might say, okay, but aren't there plenty of scientific papers and conferences on IT security? And here's a list of some of the most cited papers. A quick remark on that: counting the citations of papers is itself a very controversial thing. I cannot go into that, but there's a whole debate about whether you should use something like an impact factor or whether that's a bad idea. But at least I think it tells us which are the scientific papers that other scientists care about. And this is a list from Google Scholar of papers from 2012 and 2013.

So the first one here says "Candidate Indistinguishability Obfuscation and Functional Encryption for All Circuits". Now I could ask: if we have an average user who is using the internet, writing emails, using a web browser, using Facebook, whatever, how does this matter for him?
If you have an answer for that, I would really like to hear it; talk to me later. And I think you could ask similar questions for all of these papers. I had to go down to number 11 before I found something that sounded like it was actually about real software. That was a paper about Android malware. And also at number 20 I found another paper that was about real software, which was the Lucky Thirteen paper. And that one made me kind of question myself, because this is the kind of paper that I usually care about, because I do a lot of crypto stuff and this is a crypto attack. It's a timing attack, and it's really hard to pull that attack off. It's so hard that I'm almost certain that this attack has never been used in the wild to attack a real user. But these are the kinds of papers we find interesting, because we say, oh, they were able to pull off this interesting attack, that's great, that questions the whole way we did encryption. And it had a pretty big impact. Yeah, I'm also proud that I actually found a little mistake in that paper. It's not very important, but yeah.

But in this whole list with 26 papers that were the most cited papers, there was not a single paper that was doing anything with real users, where they were trying to see what's happening when real users interact with the internet or do something about security. So it seems the user is not really something IT security research cares a lot about.

So my feeling is that most academic research in IT security is comparable to basic research. When we talk about, I don't know, homomorphic encryption or indistinguishability obfuscation, these are crypto theories that may lead to some interesting products far in the future. And that's fine.
I mean, basic research is totally fine. But I feel we're completely missing the applied research. And if we look at the more practical research, I feel it tends to go into interesting sub-problems, but not the most important problems. Which is also kind of fine, but I feel there's a whole big area we're missing here.

So what would we do if we said we want to do a randomized controlled trial on security software? We could, okay, get a large group of users and randomly split them into groups. We have some groups that use different IT security products. We could say one group uses some alternative treatment, which could be something like applying regular updates and doing application whitelisting, which is generally considered the most viable alternative to antivirus software. And we could say one group gets a training where we say, okay, don't click on these email attachments. I have to say here, I don't think training users is a very good strategy, but I think we should test that anyway. And then we could have a placebo group where we say, just do the same thing you did before.

And then we try to measure security incidents, where it may be tricky to even decide when a security incident has happened. Then we could also try to measure what side effects this has. Do things crash? Do things get slower? What does it cost? Do we have some downtime? And then after some time, we compare the results.

Now, I have discussed this with a number of people before I did this talk, and the first reaction that usually comes is some form of "this is really hard, and there's this problem and that problem and that problem". And I totally agree. This is really hard. Science is hard. That's just how it is, but it doesn't mean we shouldn't do it.

So, some problems that would show up. For example, you could ask: what about the ethics of such a trial? Because you would give some security products to some people and not to others. So do you put them at risk?
But if you think about it, that's a very comparable situation to medicine. If you test a medical drug, then you give the drug to some people and you don't give it to other people. So it could be that this drug helps some people and the others go without that help, but it could also be that this drug has a risk and that people suffer from taking it. But we generally have the idea in medicine that if we don't know whether a drug helps, then testing it is an ethical thing to do, because it will help many more people in the future.

Yeah, then we may wonder how we reliably measure what an incident is, because we may have situations where it's not even clear what counts as a hack, or you have been hacked and you don't know about it. And you probably get very different results depending on whether you have someone who is just affected by the normal, everyday internet malware, or someone who is targeted by a professional attacker. And many more. I'm not saying it's easy; there are many problems to be solved.

And I think the security applications and antivirus software are just an example here. I think there are many things that we could test with such trials. There are debates about the safety of programming languages. Is Rust better than C++? I think so, but I would like to see studies on it. Or application security, like: is browser A more secure than browser B? That could be tested.

And finally, I want to bring up an example which I'd say is in some sense both a good and a bad example. So this was a tweet from the FTC, the Federal Trade Commission in the US, where they say, "Encourage your loved ones to change passwords often", and some other things. So at some point they found out that they actually had no scientific evidence for this recommendation to change passwords often. And they tried to find out why they were recommending this, and then they said, okay, we're recommending this because we're doing it ourselves.
So it must be good. And then they basically recommended the opposite. They said, okay, we have reconsidered this, we looked at the evidence, we have some studies that say that mandatory password changes are probably not a good idea, and maybe you should not change your passwords on a regular basis.

So I looked at the studies that they were citing for that, and I was not completely convinced. I felt the quality of these studies was not very high. All of them were based on observational data. That means they didn't make any intervention where they put people in groups, but they did things like this: they had password data from a company where at some point they had a password changing policy and at other points not. And then there was one which was trying to make a theoretical model of password breaks and password changes and how much that matters. But the basis of all of these studies was observational data.

And then some of these studies also tried to measure things like password quality by the entropy. And if you think about it, that's not really what we care about. What we care about is whether our data gets hacked. We don't care about the entropy of our passwords. Maybe the entropy of our passwords is an indicator that we have a good password, but it's only one factor, and there are other things. Maybe people use a strong password but use it for many different services, and that's also bad. And in medicine there's a term for that, and that's "surrogate endpoint", which is when you're measuring something that's not the thing you really care about, but maybe an indicator of what you care about. And that's generally considered a lower quality of evidence.

So the good thing here is that the FTC found out that they didn't have scientific evidence for their recommendations, and they said, okay, we have to look at the scientific evidence. But the not so good thing is, I think the quality of the evidence was not so good. So the real
conclusion would be: maybe we just don't know, and we should do some proper studies on that.

So finally, I have some more things to say. I think this is the right approach, but I think it has some limits that should be considered. There are things where we want to protect ourselves against threats that we cannot really measure, because they may be just future threats. For example, currently we have a debate about post-quantum cryptography. Do we need to protect ourselves against quantum computers? There are no quantum computers today, so we cannot measure any attacks with quantum computers, but we still may want to prepare for that. And there are things where we have attack scenarios that are very obscure, where we say, okay, what if a nation-state is compromising a Debian developer, and he gives me a different package than the one he is telling me about? Which is something that the reproducible builds community is trying to tackle. I'm not sure if such an attack has ever happened, but it may still be something you want to think about protecting against. So there are definitely situations where you cannot use this approach with a controlled study.

And one more thing: there are sometimes claims that simply violate basic scientific principles. For example, if a vendor promises full protection from malware, that's just a lie. It's simply a lie, because that's impossible, because of the so-called halting problem, which is a very basic theorem of computer science. And there's a related debate in medicine, where some people argue we shouldn't even study something like homeopathy, because it simply cannot be true based on the laws of physics.

So yeah, that was my last slide. So I think today IT security is very often not based on scientific evidence. We rely on experience, we rely on experts, or even worse, we may rely on marketing. We should have evidence-based IT security, but right now we don't have the science to do that. Yeah, thank you.
I said we likely don't have time for questions, but if someone is very quick and runs up to the microphone, then we can take one. Yes, microphone three.

Do you hear me? Yeah. If you go to medical studies, it's actually the highest quality if you do double-blinded randomized controlled studies, which means neither the experimenter nor the participant knows whether they take the placebo or the actual medication. Do you think that is something that you can actually implement? Because people act differently if they know they have the placebo or if they have the drug, which greatly interferes with the randomized control.

So, blinding studies: whether you can do that depends on your situation, because there are situations where you cannot blind, if it's something that the user has to actively do. But if possible, yeah, blinding is better.

Okay, sorry, unfortunately we don't have any more time for questions. But as I said, Hanno will be prepared and ready to answer some questions right here. Thanks.