Thank you, Ian. I will talk to you about work that I did not do alone. I had lots of collaborators, among them Sascha Fahl, who is sick right now but will be at the reception later, Simson Garfinkel, and Michelle Mazurek, who is from Maryland, as well as local colleagues from Germany, where I work. The idea for this work came about when Michelle Mazurek and Simson Garfinkel sat down and talked: we've been talking about usable security a lot, but has there been an effort to make cryptographic APIs more usable, and if so, was it successful? They couldn't find a source that had done the work they were interested in, so they decided we should do it together. Everybody knows cryptography is hard for developers to implement, and in my field, usable security, there's been a lot of talk along the lines of: maybe these people are just not motivated, and maybe if you incentivize it better, if you get evaluated on security results, that motivates people to try harder. But then we see cases where people who are strongly motivated to write cryptographic code correctly, or to use cryptographic APIs correctly, still fail. So we were thinking maybe it's not just a lack of motivation on the developers' part; maybe it's actually a usability problem, where even if you want to do something, you might just not figure out how to do it correctly. And if we look at the usability problem, there are various factors that could be examined. One thing people have said would contribute to the usability of cryptographic APIs is failsafe defaults: whenever people don't specify something, there should be a secure default that they fall back on. 
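The failsafe-defaults idea can be sketched as a tiny hypothetical API: a caller who specifies nothing falls back on a secure choice, and the library refuses obviously weak requests. All names here are made up for illustration and are not from any of the studied libraries.

```python
import secrets

DEFAULT_KEY_BYTES = 32  # 256-bit key as the secure fallback

def generate_key(n_bytes=None):
    """Return a random key; callers who pass nothing get a strong default."""
    if n_bytes is None:
        n_bytes = DEFAULT_KEY_BYTES
    if n_bytes < 16:
        # fail closed rather than silently produce a weak key
        raise ValueError("refusing to generate a key shorter than 128 bits")
    return secrets.token_bytes(n_bytes)

print(len(generate_key()))  # -> 32
```

The point is that the no-arguments path, the one developers reach for first, is the secure one.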
Then there's talk about documentation: it should give you a clear idea of what you can and cannot do, and for the things you can do, it should maybe provide examples of how to do them, so you don't have to figure it out yourself. Then there's talk about complexity, and the idea is that the more choices you have, the more sources of mistakes you have, so maybe we should simplify APIs. But also, somewhat to the contrary, if you don't offer a full feature set, people might just get frustrated, give up, or use something else entirely. Then you should provide meaningful failure: when something does not work, tell people why it did not work, and fail in a way that people learn something from, so they can quickly move on, maybe using only your documentation or the message you're sending, instead of going off and googling around on the internet. And then there's the abstraction level, or even the learnability: how many things do you have to look at until you understand what you can actually do, or what you need to do to achieve your goal? Luckily for us, there are several APIs designed specifically for usability, and we decided to look at some of them. For the developers who built those, usability seems to have mostly meant simplifying the set of choices, picking secure defaults, and offering fewer options. So our research question in the study I'm presenting was: do cryptographic APIs designed for usability actually lead to better security? We evaluated this question in an online study with Python developers, in which we primed developers that they had to write secure code. 
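The "meaningful failure" factor can be sketched with a hypothetical error path: instead of a generic exception, the error says what went wrong and how to fix it, so the developer can move on without searching the web. The class name and message are illustrative, not from any real library.

```python
class KeyLengthError(ValueError):
    """Raised with an actionable message instead of a generic failure."""

def check_key(key):
    if len(key) not in (16, 24, 32):
        raise KeyLengthError(
            f"AES keys must be 16, 24, or 32 bytes long, got {len(key)}; "
            "generate one with secrets.token_bytes(32)")
    return key

try:
    check_key(b"short")
except KeyLengthError as e:
    print(e)  # explains the constraint and suggests a fix
```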
So we suggested to them that they were working on a tool that would allow citizen scientists all over the world to collect data, for example on human trafficking, and that if the cryptography in the application they were writing, or in the code base they were working on, failed, it would put actual humans at risk. We also picked the library they should use for them, so we didn't evaluate which choice they made, but how well they could work with the library we picked. We used Python as our programming language because it has many developers we could recruit from, because it has bindings for all the popular cryptographic APIs we could think of, and because it had APIs designed especially for usability. We narrowed our choice down to five libraries, which I'll briefly go over. They had roughly similar sets of features, and only some claimed to be usable. First there's PyCrypto, which is basically the most popular crypto library for Python, so we used it as a quasi-baseline. Then we had M2Crypto, which is one OpenSSL binding; it is not the official one, but the official one doesn't support the basic crypto tasks we later decided on, so we had to pick this one. Then we used cryptography.io, which claims to do crypto for humans; it offers fairly complete documentation and simplifies some methods, but it also offers access to the more complicated things you can do if you're knowledgeable. Then we picked Google's Keyczar, which is supposed to make cryptography easier and safer for humans, but which did not support the full feature set that we picked. And in the end we picked PyNaCl, which is said to avoid disaster; it's a binding for NaCl. There's also PySodium, another binding, with a slightly larger user base, but it was in beta when we did the study and didn't have documentation. So these were the libraries that we picked. 
The lower three were the ones designed with usability in mind. To evaluate them for usability, we had to have people use them, and we did that by asking developers we recruited to solve certain tasks for us. We had two sets of tasks that we gave to different people in our study, because otherwise it would have taken them too long, and they would also have been able to learn, so in the second set of encryption tasks they might have been faster. One group got the symmetric set of tasks, where they should encrypt, decrypt, generate keys, and store them securely; the challenge was that they should pick a strong key and store it securely. The other group got the asymmetric set of tasks, which also involved encrypting and decrypting, key generation, and storage, plus certificate validation, where they should check the signature and check the hostname. Sadly, we were not able to do all of these tasks with every library, but we considered these basic crypto tasks, so we wanted to look at them. We also thought that when a library did not support something this basic, it should say so somehow, so people could quickly figure out that the task was not doable and just skip to the next one. Solutions could be either secure or insecure, and when we collected the code, we later checked what people actually did. We had tried large-scale lab studies in the past, which did not really work out well for us, and at our local universities there were no large Python classes, so we decided to do an online study. You can see the online study environment here. It has options to skip a task or to say you're done and move to the next task; you can test your code, and you can try to get unstuck when you get stuck. We offered people a code skeleton where they should only fill in certain methods, so it wouldn't take too long. 
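The symmetric key generation and storage task can be sketched with only the standard library: generate a strong random key and write it to a file that only the owner can read. The file name is illustrative, and real applications might prefer an OS keyring; this is just one secure shape a solution could take, not the study's reference solution.

```python
import os
import secrets
import tempfile

def generate_and_store_key(path):
    """Generate a 256-bit random key and store it with owner-only access."""
    key = secrets.token_bytes(32)  # cryptographically strong randomness
    # O_EXCL refuses to overwrite an existing key file; 0o600 = owner-only
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(key)
    return key

key_path = os.path.join(tempfile.mkdtemp(), "app.key")
key = generate_and_store_key(key_path)
print(len(key))  # -> 32
```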
We were able to measure the code they wrote, the time they took on each task, whether or not they copied and pasted, where they clicked, and when they dropped out of the study. The people who worked or skipped their way through to the end got an exit questionnaire with several questions that we felt were important to usability. We asked whether they felt they were functionally successful in the tasks they worked on, and whether they felt the solutions they had written were secure; we felt that being able to tell when you've actually solved something securely is associated with using an API that's usable. We also used the SUS, the System Usability Scale, which is usually applied to software systems: you rate your agreement with statements like whether you would recommend this to a friend, or whether you liked using it. It's designed to measure the usability of a system, and it's only somewhat appropriate for APIs, because would you recommend an API to a friend? Would you like using it? That's maybe a strong thing to say about one. From the literature, we drew up the things other people thought were important for usable APIs, like having good documentation and, when you've made a mistake, being able to move on quickly, and we asked these questions as well, in a similar way, as agreement ratings. And in the end, we asked demographic questions, so we could tell whether or not we were working with experts. We recruited by emailing GitHub developers who had committed to Python repositories, inviting people at random. We were not able to offer them compensation, but some of them still participated. Lots of emails bounced, and very, very few people complained. 
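The SUS score mentioned above has a fixed scoring rule: ten statements rated 1 to 5, alternating positive and negative, where odd-numbered items contribute (rating − 1) and even-numbered items contribute (5 − rating), and the sum is scaled by 2.5 onto a 0 to 100 range. A minimal sketch of that computation:

```python
def sus_score(ratings):
    """Compute the System Usability Scale score from ten 1-5 ratings."""
    assert len(ratings) == 10 and all(1 <= r <= 5 for r in ratings)
    total = sum((r - 1) if i % 2 == 0 else (5 - r)   # i=0 is item 1 (positive)
                for i, r in enumerate(ratings))
    return total * 2.5

# A respondent who answers neutrally (all 3s) lands exactly in the middle:
print(sus_score([3] * 10))  # -> 50.0
```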
You can see here that we had 256 valid participants out of some 1,500 who started, so lots of people dropped out of the study. Most of the people in our study were professionals, and most of them had no security background. The diagram you see here compares how active the people we invited are on GitHub to how active the people who actually finished the study are. On average, the dark blue is the people we sent invitation emails to, and the light blue is the people who actually participated in our study, so you can see that the participants were somewhat more active, maybe slightly more active than average, but we didn't reach only one extreme end of the population. One important metric that we thought related to usability was whether people who started the study would actually be able to finish. In the asymmetric condition, twice as many people as in the symmetric condition just stopped working on the tasks at some point. In general, few people assigned to PyCrypto or cryptography.io dropped out during the study, and dropouts were about twice as likely when people had been assigned to M2Crypto or Keyczar. This could be due to incomplete documentation, or, we later thought, to not being able to tell quickly whether a feature is offered at all, and then searching and searching and at some point just giving up. If we look at functionality results split up by task, we can see that this varies wildly, both by task and by the API that people were restricted to. So you can see the light blues, can you actually see? 
Well, the nearly white bar at the top is symmetric key generation and storage, and it moves from there down to the dark blue at the bottom, which is usually the smallest: certificate validation. For most APIs, the asymmetric tasks were the hardest to solve, and certificate validation, the dark blue, was the hardest of all. If we look at security, try to keep this bar chart in mind, it looks vastly different. If I skip back, for PyCrypto and M2Crypto you can see that lots of the tasks were at least somewhat functionally done; here it is way less. Secure key generation and storage was very hard, and no single participant managed to do secure certificate validation in our study. If we look at results by API, no longer split by task, we can see that functionality was pretty good for the symmetric tasks and PyCrypto, for example, and worst altogether for Keyczar, especially in asymmetric encryption, which it doesn't support well. What we found was that when we detected that people had copied and pasted a code snippet into their result, they had three times the chance of producing a functionally working solution compared to people who did not find a snippet to copy and paste, or who wrote code from scratch. Now if we look at security, suddenly Keyczar, which had the fewest functional results, has the most secure results out of those that were functional, because we did not evaluate the security of solutions that were not functional. So you can see that this gets kind of weird and hard now, because what do we want? The ideal, of course, is that everybody is able to solve each task, and to solve it securely, which none of these libraries really achieved. But then what is the next best thing? 
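For contrast, secure certificate validation, the one task no participant solved, is what Python's standard library gives you by default: `ssl.create_default_context()` both verifies the certificate chain and checks the hostname, so the secure path is simply to leave those checks on. A minimal, offline-checkable sketch:

```python
import ssl

ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # -> True (chain is validated)
print(ctx.check_hostname)                    # -> True (hostname is checked)

# A connection would then look like this (hostname illustrative, not
# executed here, since it needs network access):
# import socket
# with socket.create_connection(("example.org", 443)) as sock:
#     with ctx.wrap_socket(sock, server_hostname="example.org") as tls:
#         print(tls.version())
```

Insecure solutions typically arise from actively disabling these checks, e.g. setting `verify_mode = ssl.CERT_NONE`, rather than from missing a step.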
Solving a crypto task so that it does something and encrypts in some way, but works insecurely and can be cracked, or having something that just cannot be solved at all? I'm not going to answer this; I'm just saying we have different results all over the field here. So if we look at what differentiated things that worked from things that didn't: the SUS score we collected tracked functionality, so wherever it was especially bad, which for our participants was the case for Keyczar and M2Crypto, functionality was also especially bad. None of the libraries was ranked better than mediocre by the standards of normal software, but we don't know how this compares to scores for typical APIs, because nobody has really collected a large data set on the usability of APIs at all. So it's hard to say whether they were all really horribly usable, or whether APIs, compared to some point-and-click tool you might use, are just hard to use. Also, designing for usability did not actually lead to good perceived usability among our users, as we can see from Keyczar being ranked very low in usability. Good documentation, on the other hand, could differentiate. We got lots of comments about how some documentation was not useful, and people dropped out saying the documentation was just not useful, while the library that did comparably well offers both low-level functionality and helpful documentation, and it led to the best combination of functionality and security results. Full functionality is also important: in our study, when people were not able to tell whether they could proceed with a task with the library they had, they just dropped out, and in real life the library will probably just not be used, because if people cannot tell whether it can do a thing, it is practically useless in many cases. 
We were thinking maybe the participants brought different things to the table that differentiated what worked from what did not, but we found that their experience with Python and with programming in general did not matter in any way. We also found that security background almost mattered to security results, but even people who said they had a security background, who had worked in IT security or taken classes on security, were not able to produce significantly better results than the others. And now let's look at the scary solutions. For me, those are the ones that work on the surface but are insecure, and you can't tell: 20% of the insecure solutions were believed to be secure by the participants, and sadly this did not differ by the API people used to solve the task. So no API was especially good or especially bad at helping people figure out whether their solution was secure. So what did we do? We investigated ways to measure the usability of cryptographic APIs in particular, in a developer study. We used the SUS, which is widely used for normal software, maybe for end users too, and even though the questions don't really apply here, the final score it produces somewhat correlates with solving tasks, though not with solving them securely. We tried asking diagnostic questions, and at least we could tell that people who were not able to use something also thought its documentation was bad. 
We measured functional task success as one metric of usability, but for the usability of something that does crypto, maybe we need more: maybe we need to measure secure solutions, or the fraction of secure solutions out of all the functional solutions, because when someone doesn't do anything at all, they just go away, and then at least they don't write insecure code. Or maybe we should measure usability as the fraction of developers who would rather give up on what they're working on than work with a certain library or API. These are just things that we investigated, and they all seem, in some way, to give meaningful answers about whether an API works for the task it's designed to solve. So I'm closing with our takeaways: implementing crypto is hard, even if you're an expert, even if you're a security expert, and it's not enough to design an API for simplicity. Usability is more than just limiting choices, and even if you design for usability, if you don't test it, maybe you just say you designed something for usability, but it's not actually usable. We give these questions that you can use to figure out where your API is usable and where it is not. We also think a full feature set is very important for usability, because if a library only works in a fraction of the cases you might need it for, people will just use something else. And documentation is important for usability and also for security. There's one meme that a frustrated person left us, I don't know if you can see it here, before they gave up, and it's the thing that stuck with me the most. When you design something for people, documentation is what you use to communicate with them, and if your documentation is bad, you should feel bad and give it more effort, even though it's nobody's favorite task. Thank you. Okay, anyone have questions for Yasemin? 
At least one of those interface libraries you used here, I used it once, and to get to the actual crypto routines, you had to call a sub-package called hazmat, for hazardous materials. So in that case you're supposed to know what you're doing, but we couldn't do what we wanted to do otherwise. What's your position on giving low-level access to people who supposedly know what they're doing, or, to put it the other way around, are the high-level APIs rich enough? So I think the one you mean is cryptography.io, which was the one that kind of worked best here in all cases, so the approach seems to be not that bad, and they're not the only ones with that approach. I think what maybe made this okay in this case was the documentation. They were able to say: if you don't know what you're doing, please don't do this; but if you're doing it anyway, at least here's a code snippet you can use to do what you maybe shouldn't be doing, because you might not know what you're doing. That's probably the better alternative to saying: this is not supported, go away, use something else. I would like to know if there is some correlation between the success rate of participants and the time they used to solve the tasks. You didn't mention the time consumption of the participants. Did they use, say, one hour, or did they read the whole documentation, and after four hours they knew all about security and how to solve the task? So people took roughly one hour to finish across conditions, if they finished successfully, and we mostly used the time measurement to figure out when someone had given up, so we could start inviting new people and shut down the instances they were using. That's a pretty good question. I think there was no clear distinction in time usage; they were closely clustered together, but we should have looked at that. 
Okay, thank you very much. Are there questions at the top? It's hard to see, wave your hand. Okay, I have a hopefully simple question: is there any hope for humanity? Well, the hope is here in this room. Okay, thank you as well.