Welcome to GPN, Gulaschprogrammiernacht 21, day 2, good afternoon. It's warm outside, so the hall is full, which is nice. As you have already noticed, all these LLMs want to make us believe that IT and IT security are totally boring and uncreative, and that you can simply replace people with some bots. That this is not the case, quite the contrary, that you need a lot of creativity, is what Jiska will tell us now with "Beyond the Checkbox: Breaking Out of Testing Frameworks". Please welcome her with a big round of applause.

So, thank you for the introduction. I think I screwed up the language setting in the submission. It was German until yesterday, and then I switched it to English because someone told me: isn't that talk in English? It has an English abstract and everything. So yes, it's going to be in English; I hope everyone here is fine with that. I'm Jiska, I work in academia, so I'm sort of a hackademic: I hack all the things, for science. I'm usually not using testing frameworks, but of course many people do. Who here is regularly doing pen tests or something? A few. Or playing CTFs? A few, okay, so some of the target audience is here. All right. Originally I gave this talk at an OWASP meetup, so it is a bit tailored to the OWASP Mobile Application Security Verification Standard, but believe me, it applies to basically any testing framework that you find out there; the takeaways are similar no matter which testing framework you take. The idea of a testing framework is that you get complete and consistent test results: when you have any product out there and test it against this specification or standard, you would find all the things that are insecure and can fix them, and the test results are always comparable. So that's the theory. And the good part about this is compliance, right?
So you have a security checklist for those complete, consistent results: you can check your application and say, yes, we are compliant, all checkboxes ticked. But reality is a bit different. Reality is more like: we had this crazy idea, and then this happens and that other thing happens, and the real world is a bit off. A real-world attacker might say: I let them connect to my free Starbucks Wi-Fi and then I use whatever, or I just use my zero-day exploit and I will hack you. That's reality. Sophisticated attacks happen, and, I think this is not in the attacker model for most people, if someone can afford buying a zero-day to hack you, you are screwed no matter how much testing you did. So for a long time I thought we don't need those testing guides at all. Why are they there? Just checkboxes, all boring; you can always hack into a system with sufficient effort. And especially in academia, where I come from, they are not so relevant, even though academia should actually contribute to them more. They are not so relevant most of the time because in academia you want to create something novel. That means you pick an interesting target that nobody looked into before, you find a new bug class that nobody cared about before, or you research something new like a new bug-finding method, you write a new fuzzer or something. This is often a very difficult journey, but all of these things create novelty, and that's the central point. But because they are novel, you don't know when starting the research whether you will actually find a new bug or whether you will be successful. It's really, really uncertain: you have a target that nobody looked into, you start somewhere, you might make wrong decisions and never find any bug. And sometimes just minor differences decide whether you find something in the real world.
And the not-so-fun part about this is that you might spend half a year looking into a target, and meanwhile someone on Twitter posts a tweetable proof of concept of how to break that target, and you're like: what? Really? But this happens. And in this whole process, those checklist guides usually won't lead you to novel research; that has to be said about them. Basically all the things I did with Bluetooth, from my introduction, were probably not on any checklist. The first thing was looking into how to hook into Bluetooth communication in a way that lets you inspect even cryptographic implementation details. Nobody had tested that before, and it turned out that once you can test the Bluetooth protocol with cheap devices, there are bugs sitting in billions of devices, but there was no tooling. Or the attack surfaces: for wireless, the model was that you only attack Bluetooth and Wi-Fi is a different thing, but actually, in a combo chip, you can get code execution from Bluetooth over to Wi-Fi. That's just not in the generic attacker model, so there are things outside those attack surfaces which are of course not covered by any checkbox list. And the next thing: yes, there are fuzzers out there, but they were incapable of fuzzing Bluetooth stacks. You take any Bluetooth implementation, put a fuzzer around it, do a few new things to make it run better, and suddenly you find bugs. So basically, where I started, there were none of these testing guides, just unanswered questions: could this work, could that work? I was also trying things that were unsuccessful and not on any list; it was all outside of this. But what happened then is that a lot of students at the university said: hey, for my thesis I don't want to do something that's so uncertain.
This kind of research is really, really uncertain, and it often takes longer than the half year of a thesis. So they said: when I graduate, I want to become a pen tester. Or: I have experience with CTFs, I want to do something of a similar scope for my thesis. And they also said: Bluetooth, I don't know, it's a weird thing; I want to do web, mobile applications, IoT, something that I know. So how do we test this? And suddenly I was back at those testing guides that are out there. Some researchers might not need them, but for some scenarios they are pretty good, and one of the students I supervised even contributed a bit to such a testing guide; a bit more on this later. App testing guides, you can think of it like this, have a very generic threat model. For mobile applications, you think about what could go wrong in any mobile application, and it doesn't matter whether it's an online banking app, a fitness tracker app, or just a game: it's the same threat model for any mobile application. From this threat model you derive a list of common attacks that could affect such an application, and then people look at what tools and tests exist to find those vulnerabilities. Sometimes there are tools, sometimes you have to do a bit on your own. And that's how any framework actually works: no matter if it's a mobile app guide or a general IoT testing framework, you have those generic models of how something works and try to derive attacks and tests. The first good thing about this is that you get common issues that should be tested in every app. Not all findings from those lists are severe, but if those lists generate severe findings, then you really did something wrong: you shouldn't have such bugs in your app.
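One classic example of such a basic, severe issue, which also comes up in the Q&A at the end, is SQL injection in a username field. Here is a minimal sketch using Python's sqlite3, with made-up table and column names, showing the vulnerable string-building variant next to the parameterized one:

```python
import sqlite3

# Illustrative in-memory user store; table and column names are made up.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (name TEXT, secret TEXT)")
con.execute("INSERT INTO users VALUES ('alice', 'hunter2')")

def lookup_unsafe(username: str):
    # Vulnerable: attacker-controlled input is pasted into the SQL string.
    return con.execute(
        f"SELECT secret FROM users WHERE name = '{username}'"
    ).fetchall()

def lookup_safe(username: str):
    # Parameterized query: the driver treats the input strictly as data.
    return con.execute(
        "SELECT secret FROM users WHERE name = ?", (username,)
    ).fetchall()

payload = "' OR '1'='1"
leaked = lookup_unsafe(payload)   # the injected condition matches every row
nothing = lookup_safe(payload)    # no user is literally named "' OR '1'='1"
```

Any checklist-driven test catches the first variant with a trivial payload; if it fires, something far more basic than the checklist has gone wrong.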
And it's also beginner-friendly. If you don't know where to start, especially as a student writing a thesis or something, it's really nice to have this list telling you: these are common threats to test for. And already knowing a list of tools can be very helpful when you start with something. But then there is also a lot of contra. Your results will never be complete; I don't think any testing framework ensures completeness. That's an overclaim: there is always something that was not covered. Even with a very narrow scope, say only mobile applications, there is so much different tooling to create mobile applications, so many different frameworks, programming languages and everything, that even if you only tested for one type of vulnerability across all those apps, you would need very different tooling, and your results might still be inconsistent. Just the way you produce them leads to inconsistency. So I don't think any guide can claim completeness and consistency, no matter how hard you try. Also, bug classes change over time. Yes, the testing guide frameworks might get updated, but not necessarily, so things change and are not covered. It's much more important to teach students how to do research and find bugs for a lifetime, not just how to check checklists: how to do threat modeling on your own instead of taking one that exists, how to create your own tooling. So there are a lot of app-testing-guide blind spots, and I think the first big one is that you have to do your own threat model: you have to create your own list of threats and attacks, of what you want to protect and what you need to prioritize. Which is the next issue: I think a lot of testing frameworks generate many, many minor findings that are not necessarily relevant to the app.
But at least you get a long list, and that makes the customer happy, so to say, because there are many things they can do, just minor fixes. Also, in academia there is no out of scope, and for real-world attackers there is no out of scope, but very often in pen testing there is a very narrow scope. And of course you need to build your own tooling; in pen testing there is often no time for that. What I discovered was: where you don't have tooling and have to create tools for something, that's where you find a lot of bugs. For the OWASP Mobile Application Security Verification Standard, there are just those common threats, and the testing guide says: there is the following tooling, use this. So you probably won't find many new things. If two pen testers used the same list and followed it exactly, there wouldn't be much new to find, but there might still be a lot of oversights. And there is one thing this guide in particular does: it assumes that a mobile app is pretty much a web client. I think that's also an overgeneralization, this "everything is basically web in the end". And it's actually not so good, especially if you look into the storage requirements. One idea in this guide is that anything stored locally on a device is insecure: other apps might attack the sandbox of the app, or your backup might end up in iCloud, or even on your local Mac, or in Google Cloud, wherever your backup lands in the end, and that could be insecure. So to control the data and secure it, the guide says, it's much better to put the data in the cloud. That's the theory, and it means the cloud of the vendor in this case. For example, if you have a fitness tracker, then the fitness data should go into the cloud of the fitness tracker provider.
Or, I don't know, that's what they say: your personal pictures, private messages, everything is more secure when you move it to the cloud. Maybe think a bit about what is good and what is not so good there. Also for usability I think it's pretty horrible. Who of you has tried backing up their Signal messages? Yeah, it's really horrible: every time you migrate to a new device there is a high chance that you lose them. But of course they are very, very securely stored on your one mobile phone and really hard to migrate. And I think that's one of the results of following such standards, rather than giving the user the possibility to say where the data goes and actually backing it up when they want to. Another thing that I personally hate a lot about such standards is that they say: you should prevent researchers from reverse engineering and tampering with your application. There are some cases where that might be legitimate; an online banking app maybe should not run on a rooted or jailbroken device, because it has to do with your money and the sandbox might be broken. But in most cases those detections are really harmful for security researchers, because even when there is a bug bounty, if you need to spend many hours or even days bypassing such detections before you can even analyze an application, nobody will look into it, and even with a bug bounty program you won't get any free research or free security testing. And don't think of this as the only security measure either, because yes, it takes time to bypass, but all of these solutions are very, very similar. A capable attacker who bypasses reverse-engineering protections frequently just takes the same script, and it will run on basically all the apps; there are not so many different solutions out there.
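To illustrate why such detections only slow an analyst down: hooking frameworks like Frida replace functions inside the running app process, no source access required. The same idea, reduced to a plain-Python analogy with entirely made-up class and method names (real bypasses hook Java or native code on the device):

```python
class BankingApp:
    """Toy stand-in for a mobile app that ships a client-side root check."""

    def is_device_rooted(self) -> bool:
        return True  # pretend the detection fired on our analysis device

    def run(self) -> str:
        if self.is_device_rooted():
            return "refusing to start"
        return "app running"

app = BankingApp()
before = app.run()  # the check blocks analysis

# What a hooking framework does conceptually: swap the check at runtime
# in the running process, without touching the app's source or binary.
BankingApp.is_device_rooted = lambda self: False
after = app.run()   # the same app instance now runs normally
```

Because the check executes inside a process the analyst fully controls, one generic hook script tends to work across many apps, which is exactly the point made above.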
I even heard of stories where pen testers said: okay, we are testing the app; the app has jailbreak detection, so we cannot analyze it, so it's basically secure, all the other tests kind of passed. So really think about whether building this into applications is a security measure. Another thing that I heard might come in one of the future standards, but is not yet in there, is user privacy. I think the biggest risk of mobile apps these days is really privacy. There is almost no mobile app that is not using some advertisement framework and sending all your data. It's really crazy what is sent; if you sniff the traffic of an application, there is a lot of information about you, and yes, it's kind of pseudonymized, but it's not really fully anonymous: there are so many details, enough to track you and collect all the data about you. Apps are free, and you want free applications, but in the end the data about you is still sent to servers; there are only a few exceptions where an application doesn't try to track as much information about you as it can, just for advertisement reasons. And the next thing, as I said, is real-world attackers. You have your app, that's the thing you pentest, but it runs on a mobile platform. On the right-hand side you can see roughly the structure of modern Android these days: you have your app, but there are preinstalled system apps, some Java framework stuff, native libraries, the Android runtime, hardware abstraction, kernel, firmware, so a lot of things that an application depends on and where you would say: well, it's out of scope, right? Usually that would of course be out of scope for a normal pentest, but for me as a security researcher in the university context, this is actually much more interesting than the app itself. If something there breaks, that has real impact on everything. So what is interesting here is,
for example, all the frameworks, whether they are part of iOS or Android or third-party frameworks, because a lot of applications use them. Another interesting part: sometimes, when you do very good testing of applications, you end up finding bugs in the operating system. One example is a VPN application that one of my students was testing. They were just using a standard VPN application, creating VPN profiles through a standard app, and suddenly something in the network stack crashed, so that you wouldn't get a network connection until you rebooted and removed the VPN profile. This went through the app but affected the operating system in the end, and this can happen pretty quickly. Really out of scope for most people are, of course, also firmware attacks: anything running in a chip on your device, which is becoming more and more unprotected. The reason is that you have more and more co-processors in your phones these days, for performance reasons, but they don't have a lot of security measures. You have security from the 90s in those firmware chips and hope that some kernel layer in between protects you, which is often not the case, as we have also seen with the latest publications from Google Project Zero: there is this display co-processor in iPhones that has been attacked, and they even found a lot of baseband flaws and so on. So firmware is really not secure. For Bluetooth I have also been looking into firmware, and if you're lucky there is maybe some protection, but often there is not even ASLR; it's really, really broken. For firmware, at least some people look into this and become aware of it; I think Android started rewriting some firmware in Rust, so maybe it's moving forward. But then there are the layers really nobody looks into, the lowest layers, the hardware, and there are so many issues there. There could be side channels, and you could monitor anything. Or let's say
there is a cryptographic calculation, and there might be a side channel to extract the key from it. Or fault injection, which is a way to skip instructions or manipulate memory reads in the CPU so that it does something different from what was written in the code; with this you can, at least theoretically, bypass a lot of things. Those attacks are expensive, but they are also super, super powerful. And if I say expensive, this is one thing: many people say, I don't care about hardware attacks, it's out of scope. But when designing something, just keep in mind that this attack is expensive on one device. So if someone really breaks whatever is stored on one device and gets all the information from it, knows all the cryptographic keys of that one device, this shouldn't break any other devices. That's a key design element to keep in mind here. So yes, those testing guides are really, really helpful for identifying common issues, but also think about whether what you found is a relevant security threat: is it significant, is it something you should report to developers, or is it just confusing? Such checklists often generate minor findings that are not worth fixing, depending on the actual threat model. Will it improve security when it gets fixed, or is it just really, really annoying to fix? This also creates a lot of friction between developers and security testers. In academia we have those really weird attack surfaces that are new and hard to explain, and people have to fix them; in pen testing it's more like all those minor findings that someone has to fix. So really communicating what is important to fix matters here. And that's it already with my talk. Questions? Perfectly on time, thank you very much. Questions, please raise your hand, I'll come with the mic to you. Come on, there have to be some, don't be shy, I won't bite, promise. You can also anonymously tell us what was the worst pentest report that you or a friend of
yours got. Yeah, there in the back, someone is breaking the ice, thank you very much. Since it's also about weird things in pen tests, I'll just say: SQL injection in the username field. Yes! I mean, this should be tested; that's what I said: if someone finds issues with those very basic things, something is really broken in your app. Do your students come up with new bugs or new approaches to finding bugs, do they surprise you? Yes, every now and then I get students who think further than me and who I learn from, and that's something I really enjoy about my work as well: seeing people grow, seeing people come up with ideas. You mentioned out of scope; what do you do if you found something critical and the vendor just said, sorry, but this is out of scope, we don't want to handle this? I mean, there are kind of two things. Either the vendor says it's out of scope for the bug bounty, in which case it's just about reporting, and I don't care whether I get money as long as it gets fixed. Or, if it doesn't get fixed, then apparently one can publish. Do you do this regularly, publish something the vendor doesn't want to fix? It's difficult. One thing we looked into recently that is difficult to fix is digital car keys, which measure distance with ultra-wideband, and there is something broken in the spec that allows some distance shortening, depending on how the algorithm is implemented and what checks are there. It's not fully specified how chips do it, and we found it works on Apple chips, as long as one of the endpoints is an Apple chip. So yes, the question is: would you publish or would you not? I think it helped a lot to publish it and let people know early on that what the spec says is not necessarily secure. It's really difficult; sometimes you say maybe one shouldn't publish, but most of the time publishing such a thing, even if it's a
bit difficult to fix or a bit out of scope, helps to avoid having it built into a lot of systems and relied upon too much. Thank you. First of all, thank you for the talk, that was really cool. Regarding side channels: would you say we are on so many layers of abstraction today that a normal Android app can't even look out for side-channel attacks or the like anymore? I think there have even been attacks shown to work in browsers, so even with the abstraction layers some stuff should be working. It really depends on the kind of side channel: a CPU side channel, or, for me as a wireless person, maybe a side channel from traffic patterns, like knowing whether other apps are sending something. There are so many things you could consider a side channel, and also a lot of things happening in the background that you might be able to trigger and measure. I think there have also been papers on mobile side channels, but not so many people have looked into it, that's true. Any further questions? Last chance, going once. If I understood correctly, it's not really useful to use the standardized tests in academia, but do you think it's still worthwhile to use them in commercial pen testing? I think so, yes. What I would do in commercial pen testing is nonetheless try to make some threat model and compare. I would say they are good to start with, but not good to end with; just thinking a bit further really helps. Anyone else? Last chance. Well, if there are no further questions, I would say thank you very much, Jiska, for the wonderful talk, and please give a very warm round of applause for Jiska. Thank you.