Hi everyone. Okay, I think it's probably about time to get started. This is Build Usable Tools, a workshop on learning to do heuristic evaluations and user testing. If you don't want to be doing a workshop and you thought I'd just be lecturing the whole time, I won't be upset if you run away or hide in the back. So with that out of the way, what are we going to be doing here today? Like I said, we're going to get hands-on experience running user tests on each other and trying to evaluate software without users in place. The goal is that these things will help you even if you're never the one running the test: they're things to keep in mind when you're designing and building software, anything that a real person might use. Okay, so what are the goals of user testing? Roughly, you want to understand what is going on with your application broadly. What is the mental model that people have? How do they think they're interacting with your software, and how is that the same as or different from how you think they're going to interact with it? You also want to look for problems that you could avoid: if you can find something that you know for a fact is going to be a pain point, you can figure out how to get past it. And you want to evaluate the usability of the interface, meaning how well people are actually able to interact with it. Okay, let's go over some principles for when you're actually running a user test. The number one thing to keep in mind is that testing is a distressing experience for the person. From your perspective, you have this nice person who is kindly helping you out, but from their perspective, it's like being in grade school again.
It's as if they're being tested on how well they can use this, and they really want to do a great job. That's actually at odds with what you want from the experience: you want to get at the average experience, and you don't want them freaked out. So keep trying to keep them calm, and help them understand that you're not testing them; you're testing the product, and they're just helping. Okay, as for some practical advice about what to do beforehand, it's mostly up here. Consent forms: definitely a thing. Especially if your app, or whatever you're building, might have particular types of content in it, consent forms are there to help you warn people about what's going on. And there's this classic thread in psychology testing of "we'll only tell them what this was about once it's completely done, we'll trick them, and we'll get better results that way." No. It's not worth it. You don't get better results from doing that, and it's just a cruel thing to do. Also, you want to explain the goal in a way that they'll understand it. They may not be a technical person; in fact, it's often better if they're not, depending on who your users are going to be. So try to explain to them what it is that you're doing and why you're doing it. Often you'll get users just from posting a sign saying "$25 an hour if you come here and play with our app for an hour." So explaining to this audience, who might be anyone, what is going on is super important. There's also what's fancily called the demand characteristic: you want your app to be working, and they realize this. When you're having a conversation with someone, it's pretty likely that they know what answer you're going for, right?
Like, "hey, do you think this is a good time to go over to the pizza tent?" There's a correct answer to that, and the person you're asking knows what it is. So you have to keep as straight a face as possible and not let them know what your goals are, because your goals are to find the flaws, and if they're in a yes, yes, yes mode, saying "yes, I do think this is great," you're not going to get the value out of it that you could be getting. The last thing to do beforehand is make sure to answer any questions that they have. It's very much about creating an environment where they'll be comfortable, as if they don't have someone looking over their shoulder. And then during the actual testing, keep that straight face: don't let them know if they're doing well, and don't let them know if they're doing wrong. There was a classic, famous horse (Clever Hans) that people thought had learned to do math, but had actually just learned to read the person handling it. The way this would work is the person would go to the horse and say, "hey, do you know what two plus two is?" Then the horse would start tapping on the ground, and when the horse hit four, the person would get really excited, and at five they'd go "yes!" The horse had figured this out. So the horse didn't actually know how to do math; it was just reading cues that the person running the experiment didn't necessarily even know they were giving. So try to keep that down as much as possible, and never act like they've done something wrong. That's just the worst. Okay, and then afterwards: answer any more questions that they might have, and tell them what the goal was. Anything you had to keep from them before, you can tell them after; it's kind to do so, assuming they won't go tell other people.
And then when you're writing about the results of the experiment, the language you use matters. Companies have customers; apps might have users, or they might have people that they're protecting. Whatever word you use, you're saying something: if you use "consumer," you're already saying something about the expectations, and you want to try to keep it objective. I like "participant"; it's pretty good. "Subject" sounds as if you were doing experiments on them. So that's another thing to be careful about when you're communicating your results to other people. And in general, you don't want to waste people's time. You know the demo effect, right? You go up to present and it doesn't work. Don't have that happen with your user test: run it with someone on your team beforehand just to iron out any kinks in the test itself, and have everything ready before the users show up. Like I said, make everything comfortable, allow time for breaks, have good pacing, and all the things I was mentioning before about not stressing them out. And once again, just make sure to emphasize that it's the interface or technology that you're testing, not the person. Okay, so the thing to remember generally is: use good judgment here. It's not a contest of who can be the meanest to people or who can trick them into things. They're volunteers, so treat them with respect, and ask if you're not sure. Okay, so that's if you're running a user test. But if you don't actually have users available, you might want to run something called a heuristic evaluation, which is a fancy way of saying that we're going to pick several things that are known to be problems and see if we can find any of them beforehand. That has the obvious benefits that you don't need a person there, it's easier and faster, and you can also keep these in mind when you're designing things, as things to avoid.
It's fast and it can be cheaper. So what are some examples of this? Well, actually, first let's talk about how you do it, before the examples. This graph over here, I think, is really cool. It's a distribution of the problems that different people will find, and it shows why you need multiple evaluators. Someone ran a study to see how good they are at running studies. Amazing, right? So there are classes of usability problems: there are really hard problems to find and really easy ones, and then there are the evaluators. An individual evaluator might be more or less successful at finding problems, but even a relatively unsuccessful evaluator might find some of the harder problems, because there's an element of randomness to it. So you can't just have a single person, because they won't find everything. Around three to five is generally a good rule-of-thumb number, but obviously it depends on scale. And what does the process look like? You have someone who's responsible for being an evaluator. This is not a user; this is generally someone on the team. They go through the list of heuristics, look for violations, and try to fix those problems, and you can do it iteratively as well. Okay, what are these heuristics I'm talking about? The first heuristic (he called it H1; he was trying to be very formal about it, which is kind of cute, to be honest) is visibility of system status. This is the old Gmail interface: the classic loading bar example. Try to have something like that be actually visible to people. The user should know what's happening underneath, to the extent that it's possible. Just having a spinning thing is not as great as knowing what percent loaded you are, whether or not that number is accurate. Another heuristic is matching the system and the real world.
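That three-to-five rule of thumb comes from fitting a curve to graphs like this one: Nielsen and Landauer modeled the expected number of problems found by n independent evaluators as N * (1 - (1 - L)^n), where L is the chance a single evaluator spots a given problem (roughly 0.31 on average in their data). A minimal sketch of that model, with illustrative numbers that are not from the talk:

```python
def problems_found(n_evaluators, total_problems=100, discovery_rate=0.31):
    """Expected number of distinct usability problems uncovered by a
    panel of independent evaluators (Nielsen-Landauer model).
    discovery_rate is the per-evaluator chance of finding any one problem."""
    return total_problems * (1 - (1 - discovery_rate) ** n_evaluators)

# Diminishing returns: most problems surface by 3-5 evaluators.
for n in (1, 3, 5, 10):
    print(f"{n} evaluator(s): ~{problems_found(n):.0f} of 100 problems")
```

With these assumed numbers, one evaluator finds about a third of the problems while five find around 85%, which is why adding a sixth or seventh evaluator buys you relatively little.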
This is something that goes in and out of vogue: skeuomorphism, where you make things look more or less like their physical-world counterparts. But when there's a match with the world that people are used to experiencing (which honestly, at this point, can even include existing modes of interaction, like the save icon: people know what that means even if they have never seen a floppy disk), matching the user's expectations to how you're showing your system really helps the user build those metaphors in their mind. This example here is a violation. Where it comes from is an old Mac interface (I don't know if they still do this, actually): there's a trash can, and you can put files in the trash can, which makes sense, but if you're putting a disk in the trash can to eject it, that's a mismatch between the world the user understands and what they're seeing on the screen, and it's better to avoid those if possible. Another one is user control and freedom: if the user is not actually presented with a meaningful choice, why is there a choice there at all? These are, again, from back in the day (it's improved now), but when you got new software and had to restart your computer, your two options were either shut down right now or restart right now. There's no "do it later," no "install it next time," and the user doesn't have control over what's going on, which is something you should definitely try to avoid. The other one here is Windows doing the same thing; I have no idea what Windows is doing these days. There's a saying that goes: when all else fails, standardize. If you really can't match the real world, can't match any expectations, and can't display it in a clear way, just stick to standards that people know.
If you're consistent about something, users see it more often, and even if it's something they have to learn, they'll learn it faster. Another heuristic is error prevention. Users are so good at ignoring dialog boxes, and that's not to say "oh man, we should make dialog boxes more annoying so the user notices them more." No. Instead, it's better to remove the interactions that users were just going to click yes on anyway. For like 90% of users, they're just going to click next, next, next (no, 90 is definitely underestimating it). And when there actually is a meaningful choice they have to make, make it clear what the choice is and express it to them. This one over here could have originally said "are you sure you want to save this as a PDF? Otherwise it'll be a .doc," with accept and cancel buttons. But having the two different options precisely state what the choices are is easier for the user to understand. Next: recognition rather than recall. The locked-down box of "you only have these five options," the narrowing of choices, is not the best for freedom; a terminal obviously has more things that you could possibly do. But for a user that doesn't spend 100% of their time learning and loving a system, it's a lot easier to recognize something than to have to produce it. If you wanted to know the time, you could do that in the terminal, but you'd have to know how off the top of your head, whereas if you have a series of icons, you can choose the clock. This is a classic test-taking technique: it's why multiple choice is easier than a long-form question. It's less mental taxation to choose from a list of existing options. Obviously that's not right for all situations.
But to the extent possible, if you're trying to make it easier for the user to do a specific task, showing them that they might be able to do it is a nice way of doing things. Then there's flexibility and efficiency of use, and I think this is actually a great example of it here. Instead of giving just a little bit of information, it makes the system more efficient to use by showing what the temperature will be during that time. It's a small detail, but it helps people make decisions in the moment. Making it easier for them to know whether they'll be going to this event, which is what they have to answer at this screen, is made easier by saying, "oh, it will be raining; are you sure you want to go to the outdoor hacker camp?" for example. Another one is minimalist design, which is possibly past its heyday at this point, but keeping things cleaner makes them easier for users to use. This was a study where people had to click to do things in different applications, and when people could click things that read left to right, top to bottom (the one on the left), they were able to do it faster than when they had to hunt around for elements in different places on the screen. If the way you're showing it, the aesthetics of it, is clean and simple, it does make a difference in how well people are able to interact with it. Okay, and then there's the back-button one: users make errors. It's going to happen. You shouldn't punish them for it, but you should allow them to recover from it. Back buttons, undo buttons, being able to put something in the trash and then bring it back for the next 30 seconds, being able to unsend an email: these are radical developments in the usability of software that everything should really strive to have as much as possible.
Otherwise, you're not letting them fix anything; you're just saying, "oh, your thing is incorrect. Great, awesome. Well, there's nothing you can do about it now, you're already done." And then, if all else fails, make sure there's good documentation, and make sure the documentation is easily accessible and searchable. The one I'm demonstrating here is on a Mac: when you go into Help and start typing, it highlights where the item is in the menus, so that next time you won't have to go through Help. You'll know, "oh wow, I could have just opened that menu item right there and gotten to it directly." Things like that, which make it easier than having to read a 30-page document before you can use something for the first time, are a goal to strive for. But having the documentation there is a backup. Okay, so time for the first exercise. We're going to take about 10 minutes. Go into small groups, however big you want, but two to four is probably best. Then pick some sort of interface that you regularly interact with and evaluate it against the list of heuristics, which is on the handouts that I have given around. So maybe find someone with a handout near you and join their group. If you need any suggestions, I have a couple here; some of them are cleaner than others. Just try to get a sense of what it means to apply these heuristics, and if you have any questions, I'll be up here. Start wrapping up your final observations. Okay, time is up for that activity, everyone. Thank you all for playing along. Okay, so would anyone like to talk about something that they discussed, something that they thought was interesting, maybe an interface that is a bit out there that you enjoyed evaluating? Any cool findings? Yeah, in the back there. Hello. Hello. Yeah. So yeah, where is it?
Following these points, there was one on consistency and standards. We don't usually have a back button to go to the previous page of the wiki, and the links don't give you a clue that they're links; it's just text, and it's like, "oh, is this clickable or not?" And then we noticed other things. For example, on flexibility and efficiency of use, it's a bit annoying that you enter the website and then you have to go to the day and then scroll to the time. It would be nicer if it somehow suggested, "oh, what is going on right now?" or directly led you there somehow. Anything else? Well, there are many things. Yeah, great. Thank you. Why don't we let someone else talk? That was awesome. Thanks. Oh, you. Yeah, we also looked at a point where, for example, we are scanning some servers here for problems, and we were discussing whether there should be an indication of how long the scanning will take. There's nice information here about when the scan was started, but it would also be nice to show how things are progressing, that something is actually progressing in the background, which you don't get at the moment. But it looks very good. And the question behind it was: is it also use-case related? What kind of requirements do you put on your design? Because some users are more experienced. For example, an administrator looks at this and says, "okay, the scan was started two hours ago, it might be finished in the next 10 minutes or so." But other users don't have this knowledge, and so you have to show them this information somehow. Yeah, it's definitely relevant, especially when you're thinking about what metaphors people have ready, and whether someone comes in with different expectations. Knowing what your user expects, and building a profile of your user, is actually something we could talk about for another couple of hours here.
But yeah, definitely getting a sense of where users are coming from, and doing all of this with that in mind, is definitely the right way. Yeah. Okay. We evaluated the mutt email program and considered error prevention, recognition rather than recall, and flexibility and efficiency of use. So, starting with error prevention, we found out that, according to our test subject, mutt is terrible at preventing data loss. It's unknown if it prevents clutter. It has no prevention of misinterpretation... no, sorry, there is no chance of misinterpretation; that's the way I remember it, anyway. It has no bad-input prevention. On the other hand, it doesn't impose unnecessary constraints, so that's great. The interface mostly requires a lot of recall, because it's all one-letter commands, but it does show some shortcuts at the top, so if you're totally lost, at least you know how to get to the help screen. So, documentation and help. Yeah. And that's the next point, flexibility and efficiency of use: it has a really unlimited number of accelerators, more than anyone could possibly remember. Yeah, that's as far as we got. Great. That's awesome. Thank you. Yeah, that's a tough one to evaluate. I think we're moving on from this one, but we're going to do another exercise, and hopefully we'll have time for people to report back from that one as well. Okay, great. So, before, we were doing heuristic evaluation, which is taking things that we expect might be problem areas. But a user study can give you information about the things that aren't on that sheet, things you might not think to look for. So the best thing going into a user study is to decide what it is that you're trying to figure out. Are you trying to identify problems? Are you trying to get use cases? Are you trying to collect anecdotes? And you want to make some guesses, like a science experiment, right?
You guess how you think people will use it, and then see whether your data, which is how well they actually use your software, matches or goes against those claims. You want to use open-ended questions. But actually, in the interest of time, I'm just going to skip ahead to the activity. There are lots of different types of questions that you can use. Maybe I'll leave this up, actually. You would generally ask these after the event, after you finish going through the study. But first, you want to take people through it and have them try to use the software in a way that's generally premeditated: you might pick some actions that they're going to try to do, or you might just throw it at them and say, "hey, what do you think this does? How would you use it?" You can also give them a bit of background: "you are in this particular position, you are trying to email your boss, and you want to attach a file," and then have them try to do that task. That's the way a user study would go. Okay, let's get past this one. After you do the user study, you generally want to give them a survey, and you can ask them these questions on what's called a Likert scale. You've probably seen this before: it's one through five, or seven, whichever you want, either works, and you ask if they agree. "I agree, this was easy to use," or "I agree, I was able to do this." But these numbers aren't actually that useful on their own. What they are useful for is being immediately followed by an open-ended question: why did you find it easy to use? What about it did you find easy? Oh, you found it hard to attach a file? Why? What was difficult about it? It gets people into the mindset of the easy question, "was it good?", and then the harder one that's actually more useful for you: "why is that?" Okay, so we're going to take five minutes here.
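The pattern just described, a Likert rating immediately followed by an open-ended "why," can be sketched as a tiny survey structure. Everything here (question wording, field names) is illustrative, not from the talk:

```python
# Standard five-point agreement scale, lowest to highest.
LIKERT = ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]

# Each Likert statement is paired with the open-ended follow-up that
# actually yields useful information.
QUESTIONS = [
    ("I found the interface easy to use.", "What made it easy or hard to use?"),
    ("I was able to attach a file.", "What was difficult about attaching a file?"),
]

def record_response(statement, rating, followup_answer):
    """Store a 1-5 rating together with the free-text answer explaining it."""
    if not 1 <= rating <= len(LIKERT):
        raise ValueError("rating must be between 1 and 5")
    return {
        "statement": statement,
        "rating": rating,
        "label": LIKERT[rating - 1],
        "why": followup_answer,
    }

r = record_response(QUESTIONS[0][0], 2, "The menu labels did not match what I expected.")
print(r["label"])  # prints: Disagree
```

The point of the pairing is that the number primes the participant with the easy judgment, and the "why" field is where the findings actually live.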
Okay, can I maybe get a quick show of hands: who here has an interface that they've built and brought here, that they'd be interested in getting user feedback on? One. Don't be shy; people won't bite, probably. Two, three. Yeah, great, great. Okay. Oh, four over there. Nice. Yeah. Uh huh. I like that one. So maybe, if you saw those four people, grab some people around them; otherwise, you can pick from the list that I had before, or just pick any user interface. What you're going to spend the next four minutes or so doing is designing a study. In the case of email, that would be picking some actions that you want people to do. I've passed around a sample user study script. You don't have to use that precisely, but in a real user study you'd want to create something like that, to be consistent. The general idea here is to think about what you expect might be pain points, what you want to learn about how people interact with your software, how you think they may or may not be able to accomplish tasks, and what those tasks might be. So come up with a couple of tasks and some questions, either in the same groups as before, or, for the people who raised their hands, maybe grab a couple of extra people if you want to evaluate something more real and exciting. That's up to you. Okay, let's take four minutes to do that, and I'll be around for questions. If you press this button it unmutes, and then you can talk. It's fine. Great. Thank you. Good dog, though. Much better. Yeah, one minute. Okay, I hope you all designed really exciting studies, because now we're going to run them on each other. Woo-hoo. Okay, so let's see. That time is a lie, because that's not how much time we have. Maybe we'll take five to ten minutes, seeing how long it takes. Okay. You've all designed this study. Now pick one or two people to actually go through it.
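The study-design step just described (pick tasks, note expected pain points, keep the script consistent across participants) can be captured as a small structure. This is a hypothetical plan for an email client, invented for illustration:

```python
# Hypothetical user-study plan, illustrating the advice above:
# concrete tasks, a guess at where each will hurt, and fixed follow-ups.
study = {
    "goal": "Can first-time users send an attachment without help?",
    "tasks": [
        {"prompt": "You want to email your boss a report. "
                   "Attach report.pdf and send the message.",
         "expected_pain_point": "Finding the attach control"},
        {"prompt": "You attached the wrong file. "
                   "Remove it and attach the right one.",
         "expected_pain_point": "Discovering that attachments can be removed"},
    ],
    "followup_questions": [
        "On a scale of 1-5, how easy was that?",
        "Why did you give that rating?",
    ],
}

# The same script is read verbatim to every participant,
# so results stay comparable across sessions.
for i, task in enumerate(study["tasks"], 1):
    print(f"Task {i}: {task['prompt']}")
```

Writing the expectations down before the session is what lets you say afterwards whether a result (like the "location" field surprise reported below) confirmed or contradicted a guess.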
You probably have people in your group who know the interface less well and better. Anyway, you're going to read the script and have them do the activity, and if you want to switch off and go back and forth, do: giving different people the chance to run the study and see what it's like trying to keep that straight face and not give feedback is something that's really helpful to practice. Questions? Yes. If they start to discuss what they're doing with you, just stay straight-faced. Yeah, great. Okay, one minute, try to wrap up. Okay, let's come back together. Great. Okay, let's see how many groups we have. We have one, two on that side, and then smaller groups on this side. Great. Okay, so I would love to hear from you how it went: what some of the tasks you had in mind were, what questions you were hoping to answer, what either went well or didn't, what might have surprised you about how the user interacted with your software, and also what it felt like to be running it. Were you surprised by anything? Was there anything that was confusing? What did you learn from doing this that you might not have realized previously? I've got a mic. You don't have to answer all of those questions, just some of them. Even just: what did you study? What questions did you ask, and what were the tasks? Well, yeah, we have here a system that's used to automate vulnerability scanning. So the task was to add a system, a server, to the program. And one of the fields was "location." I designed the software, and in my head, location meant an IP address or a hostname or something like that. But then the user filled in "local site SHA-2070." So I should change that, or make it more clear. Nice. And what did you expect beforehand?
Were you going into this thinking, "oh man, I want to make sure that the user understands what location means," or was that a surprise to you as you ran it? That was a surprise to me. Great. That's awesome. Thank you. Do you want to talk about what it was like being on the other side of the user test? Yes, I can. Another question was: in this program, you have networks and you have systems, so you can enter networks and you can enter systems. A system is a server, a network is a network, and a network consists of systems. So when you enter a system, there's a button for networks. Okay, I say my system is placed in, and I hope it's possible, two networks. And when I go back to networks, I can also enter systems for the network. So I have a bidirectional connection, and I assume that when you have a kind of n-to-n connection here, and you have a lot of systems and a lot of networks, it's very good to go in one direction: you come from the top and go down to the subcategories, and you don't allow the other way. So your mental model of the system didn't quite map onto how it works. Yes. But then we were discussing that in some cases it might be useful to allow both directions, because they are somehow complicated, interconnected. So it was a little confusing at first, but it's something that could be useful, and I feel like hearing that out. Oh yeah, the dog's back. Great. Yeah. Yeah. Okay, thank you both very much. Who else? What about the table right behind you? I think you maybe had an interface that you were trying to evaluate. Yeah. Yeah, well, so I was showing an app I've been working on for quite a long time. Basically it's an app where you can code your phone within your phone. We gave the task of... well, in fact, we didn't have a task at the beginning, it was a bit chaotic. We were exploring the interface, and then we had a task of running a project that is included within the app.
And then we found it a bit confusing, not really engaging the user to explore more. Maybe the icons were misplaced, I don't know. I think it has a lot of hidden complexity that I found really difficult to communicate to the user without a gigantic manual, and, you know, compressing everything into a really small interface is a bit complicated. I don't know, maybe you can explain something that you found out. Well, my task was to open one of the projects, which, I think, runs some JavaScript code. So I went into the menu of projects that they show on the homepage of the app and clicked on one of them, which seemed okay, but the result said "source missing" or something like that. Yeah, I don't remember; there was just something that didn't work. So then I went back and looked for another one. I wasn't getting feedback quickly enough to engage more time with the app. I was pressing the different projects and trying to see... I guess, if you're there, you want to see some results quickly, so that you engage and then get curious to explore it further. Yeah, exactly. And then the menu. Okay, the menu was in alphabetical order, so the first thing that came up at the top was "advanced," and I guess that was putting first the things that are not what you're looking for when you're first engaging with the app. I wanted more of a quick start: see something, see if the app is interesting for me, and if that's the case, then I'll go and explore further. Yeah, it's a great point. So don't throw all the information at the user at first; instead give them just a little bit, which they can then expand from as they learn more.
Yeah, yeah, that's the classic easy mode versus advanced mode sort of thing, like when you right-click and go into developer tools and then find out everything else that you can do. That's a great way of organizing things. Yeah. And there also was an interesting detail, because it's a newer Android version: I actually had trouble with the system UI around the app. So yeah, exactly, exactly. But it was, how do I get back out of this screen? Because, okay, on your Android you have to swipe up to get the back button, whereas my phone is old, it still has a hardware button for that. And I was pressing all over the phone to find the hardware button. Wow, that's a great story, actually. Thank you. Maybe we could tell the Android people about that one. Yeah. Okay, well, actually, that is time up for us. Thank you all for coming. I hope that you took away some new skills. If you have any questions, I will be here for the next eight minutes, before the talk right after, which is going to be Certbot and Let's Encrypt office hours. So if you were evaluating Certbot and you found some problems, come and talk to me about those as well. And you're free to take the sheets home with you; all the information is available online. You can also just Google "Nielsen usability heuristics" if you don't want a physical sheet of paper. And otherwise, thank you all for coming.