Thank you very much, everyone, for coming. We'll just wait a few minutes while they reboot whatever the hell it is they've just done. So this talk came out of my perspective on the industry over the last 20-25 years of software testing. Some of you may know I've worked in global roles at companies such as Google and eBay, looking at software engineering, software quality, and how testing fits into that. What we've found is that there's very little to guide software testers. Some of you have taken certification, and maybe you think that helps; some of you believe you're exploratory testers, and maybe that helps; maybe it's test automation. But we don't actually know: there's very little evidence in the industry. And I realized it would be really great if we could look at what automation is starting to do better than we're doing, and what humans can do better, to improve our work, especially given the increase in automation in software testing. Many of my examples will come from mobile, because I think it has some of the more interesting challenges, but the concepts and the principles apply throughout software. If you disbelieve me, just put your hand up and challenge me during the talk; I'll try to leave some time for questions at the end. If I speak too quickly, wave, scream, shout, cry, moan — anything that tells me I'm going at the wrong pace. But let's start with a challenge for you. When we're talking about software testing, what are we trying to achieve? Is it simply that we want to do a little bit better than today? We know it's not very good, so just make it a little bit better: "Please do a bit of testing — can you test this for 20 minutes? Tell me what you think," fairly informal. Are we aiming for perfection: no bugs in our organization, we're going to automate the hell out of everything? Are we looking for good enough?
Whatever good enough means for us. And typically, do we limit ourselves to what we know, rather than learning techniques beyond what we know? And do we believe, in our little bubble, that this is the world of software testing as we know it? The choices we make, the answers to these questions, materially affect what we do in terms of testing. So let's look, then, at machines versus human beings across the industry. Hello — of course you can, they're recording anyway. Okay, and I'll keep going while you're faffing around. Thank you. Right. So automated testing is getting much, much stronger, and there are businesses growing up around automated testing. This slide, when I drew it, was back in October last year. There's a company in the USA called AppDiff, and what they do is they take apps — and I'll tell you more about that in a minute. But earlier this week I had a chat with an Indian company called Appachhi, and they're doing something very similar. So we can see one company, two companies; soon many companies will be doing this sort of testing. There are no humans involved here; this is simply computers. What they do is download lots and lots of apps, currently from the Android app store, because it's practical to do that. They take your app and they start testing it. They query the structure of the application. Many of us are familiar with the concept of a DOM on a web page: the web page is a structural, hierarchical object we can interact with and say "find the first text box". We can do the same with every mobile app. We can query it and get information; in Android it's called the hierarchy view. So they look at the structure of each page and they effectively navigate through the application, and they do this through thousands and thousands and thousands of applications. So these are the thousands — that's one, two, three, four, five — and this is the app.
We care about this one: this is today's release, the one that's in the app store, version 2.98 or whatever it is. This was the last one; this was the one before. With me so far? Good. So they've tested this one, they've tested this one, and this one's now out there. What they do is compare every single screen and ask: has something changed in the screen? We could do that ourselves, but they can probably do it a little bit better and a little bit faster with computers, and notice a one-pixel difference. They don't worry about whether it's right or wrong; what they tell you is that this one's become bigger and more complex. And as you probably know, working with DOMs, one of the challenges is that as it gets more complex, it takes more compute power to draw it on the screen. If you're using certain techniques to display your screen contents in Android, like a ListView, it takes lots of resources. They can tell you this. They also tell you the differences from the previous release, so you'd see that this little page now seems to be slower, bigger, fatter than the previous one. Maybe it was by design, maybe we wanted to add the feature, but this just tells us that. And they give us timing data, and they can run it on different devices, so we get different timing data for the slower devices and the faster devices. Again, maybe we care about this; maybe it matters to us that our app takes five seconds on a Samsung Galaxy. There's the comparison with our competitors: if we're a shopping cart — in India you have several, in the US they have Amazon, eBay, etc. — they can compare against your peers and say, by the way, you're a little bit slower than they are on the checkout process. They also have a little bit of intelligence: if you have a Facebook login or a Google login, they've got a series of accounts they can use for this testing. So essentially this service is free. So you think: okay, why are they doing all this for free? Well, how many of you have released an app?
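The screen-structure comparison described here can be sketched in a few lines. This is my own illustration, not AppDiff's actual implementation: the UI hierarchy is simplified to nested dicts, whereas a real tool would query Android's hierarchy view.

```python
# Sketch: comparing two versions of a screen's UI hierarchy, in the spirit of
# the AppDiff-style comparison described above. The data format (nested dicts)
# and the example screens are invented for illustration.

def count_nodes(node):
    """Total number of elements in a UI hierarchy."""
    return 1 + sum(count_nodes(child) for child in node.get("children", []))

def depth(node):
    """Nesting depth of the hierarchy (deeper = more costly to draw)."""
    children = node.get("children", [])
    return 1 + (max(depth(c) for c in children) if children else 0)

def compare_screens(old, new):
    """Report growth in size and complexity between two releases of a screen."""
    return {
        "node_delta": count_nodes(new) - count_nodes(old),
        "depth_delta": depth(new) - depth(old),
    }

old_screen = {"class": "LinearLayout", "children": [
    {"class": "TextView"}, {"class": "Button"}]}
new_screen = {"class": "LinearLayout", "children": [
    {"class": "TextView"}, {"class": "Button"},
    {"class": "ListView", "children": [{"class": "TextView"}]}]}

print(compare_screens(old_screen, new_screen))
# {'node_delta': 2, 'depth_delta': 1} — the screen got bigger and deeper
```

The tool doesn't judge whether the growth is right or wrong; it just reports the delta and leaves the decision to a human.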
You agree that you've got a couple of million users? Okay, a couple of million users. So we've got someone here who's got an app with a couple of million users. You put the app in the app store, and at 5 a.m., or whatever time of day it goes out, it's available to users. It's automatically updated on people's devices, isn't it, pretty much? And you realize about two hours later that something's going wrong. Really wrong. They test this once a week. Wouldn't it be nice if it were tested beforehand? That's what you pay for: to get the information beforehand, not afterwards. The next thing comes from academic work, and this is one example of many, but it's one I read about fairly recently; the paper was published last year, so it's about 12 to 18 months old. We have an application, and the application crashes. Many of us who have mobile applications have something that automatically records the crashes; if you release through Google Play, it records the crashes and provides them back to developers free of charge. So: crash, big long stack trace, blah blah blah. What they've done is say: if we notice a crash in this application, we're going to send a command to a subset of devices that are running this application and say, please record the user interactions on the screen for this application. Makes sense? The expectation is it'll crash again. When it crashes, they've now got not just the crash, but what happened beforehand. They look at that, they generate automated tests, and the cute bit is, with your permission, when you're asleep — which all of us do from time to time — they run the tests on the application again overnight, while we're resting. When they're running it, they expect, on the same device, if they've created a good automated test, guess what? We expect it to crash again, don't we, if it's related to the inputs. They run it on different devices too, because typically, when we test, how many devices have you got — the chap with a couple of million users? Is it a bunch, 10, 50, a thousand?
Okay, so he's got tens of devices. Trouble is, your users have got maybe 25,000 different device models. So what this does is allow us to work out where the limit is: these all crash, these don't crash — what's the difference between these? Is it screen size? Is it the version of the operating system? Is it that this one's got GPS and the other one hasn't? Is it the chipset? You can learn all this by machine learning. You don't need us to sit down and do this by hand: on device number 17... oh, you've got 18 devices, hang on a sec, let's do the next set of regression tests another hour later. So we need to find ways to do this. Now there are similar concepts shipping commercially, called heat-mapping software — at least the one I know of, called Appsee. They do the same sort of thing: they essentially capture the screens, every second roughly, and create a slow-speed video that goes nowhere unless a crash occurs. Then, if a crash occurs, they give you the last trace of what the user did, and we can watch the video: now I get it, now I understand what went wrong. Key at this point, before I forget to mention it, is privacy. What's on our mobile devices tends to be personal for some applications. So if you're doing banking online, having something that quite glibly debits $50 or a thousand rupees every 10 seconds because it's testing to find a bug isn't going to wear well with me; I'm not going to be happy with that. But for some applications this is a very viable model, especially with the permission of the users, for free. Yes — on which bit, sorry? Replicating the crashing? Yes, well noticed. We get all excited when we find a crash or a bug, and sometimes we believe it's because of our brilliant prowess.
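The learning step described here — working out which device attribute separates the crashing devices from the non-crashing ones — can be illustrated with a deliberately simple sketch. The device records and attribute names below are invented, and a real system would use proper statistical methods rather than exact set comparison:

```python
# Sketch: given crash/no-crash results across devices, find the attribute
# whose values cleanly separate the two groups (screen size? OS version? GPS?).

def separating_attributes(devices):
    """Return attributes whose values never overlap between crashed and OK runs."""
    result = []
    for attr in devices[0]:
        if attr == "crashed":
            continue
        crashed_vals = {d[attr] for d in devices if d["crashed"]}
        ok_vals = {d[attr] for d in devices if not d["crashed"]}
        if not crashed_vals & ok_vals:  # no overlap: attribute separates the groups
            result.append(attr)
    return result

runs = [
    {"os": "4.4", "screen": "small", "gps": True,  "crashed": True},
    {"os": "4.4", "screen": "large", "gps": False, "crashed": True},
    {"os": "6.0", "screen": "small", "gps": True,  "crashed": False},
    {"os": "7.1", "screen": "large", "gps": False, "crashed": False},
]
print(separating_attributes(runs))  # ['os'] — the OS version explains the crashes
```

Screen size and GPS appear in both groups, so they can't explain the crashes; only the OS version does.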
I typed in "Julian", asterisk, star, some funny character, pressed the submit button, and bang. So we assume it's because of what we did that the crash occurred. A lot of the work being done makes a bunch of similar assumptions: that the inputs matter. But what happens — I'll give you an example — if the difference was a response from the server, not what the user typed? I'll come back to that point, but well noticed. The next thing is something called monkey testing. This comes from the concept that if monkeys just dumbly interact with the screens of our device, maybe we'll find something. Monkey testing is available for the main platforms; for Android it's been built in for a long time. And I'm doing a PhD at the moment, part-time, in this sort of topic. When you see researchers, they always compare themselves with monkey testing, because they think their super-duper, double-PhD-level research is going to get better results than monkey testing. Often it doesn't; often monkey testing is actually pretty good. So: this is the Google developer console, and this is one of the apps I work on, which is called Kiwix. It's part of the Wikipedia project, if you're interested; it allows you to read Wikipedia offline on your smartphones and such. We had releases 30, 31, 32 and 33 going out there — these are the internal numbers, that's how Android works, and you have a public version number as well. This is the nice green bar, and what it says is that it tested our app on 11 different devices. In this case, for 32, we can see that seven devices had issues and four didn't, and we can see here, if we click on that, we get a screenshot, and it says — it's hard to read, but it says "Unfortunately Kiwix has stopped". Okay: that dreaded OK button, which means you can't do much else. And very nicely, they've given me a video; I press that little thing and I've got a 15-second video I can watch. So I press that, watch it, and see what happened.
It shows me the application starting up. Something happened. Hmm, I wonder what that was. So let's compare: this is a passing monkey test and this is a failing one. A couple of differences. We can still see the application screen — this is just a list of the different Wikipedia databases. You can see a keyboard here, so the monkey has stopped at this moment; it's put in thousands of events. We can see a video running for five minutes and 15 seconds — it typically runs for just over 300 seconds. This one, we can see, was 15 seconds, and we've got the video to watch. We can see the differences in the devices: this is the Nexus 5, this is the Motorola G first edition. This one was set to Hindi, this one was set to English. So we can see there are some differences so far, can't we? Let's go back to the list of failures. Anyone think they can see a pattern here, about the fails versus the passes? Android version — well done. So the Android version, when it's 4.4, seems to relate to the failing activity. We move along here, and we can see that three of them have got what seems to be the same error message. Android has this way of laying out the screens in something called XML, and it's rendered at runtime; it does something called inflating the XML to make the screen as we see it. And this bit is "button": what it's doing is displaying a button. Now, Android has been able to display a button since the beginning of time, so what's the big deal? The library we used didn't work very well on four-point-something devices. It actually had a number of bugs, and if you went on to Stack Overflow and read the bug reports, you'd see lots and lots of complaints saying: if you use this library on four-point-blah, it causes problems. Let me go back a minute. Remember I said we've released our application.
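On Android, the built-in monkey is driven from the command line — for example `adb shell monkey -p <your.package.name> -v 5000` fires 5,000 pseudo-random events at one app. The idea itself is simple enough to sketch; this toy version is my own illustration, not Android's implementation, and the fake "app" and screen size are invented:

```python
# Toy sketch of a UI "monkey": fire pseudo-random events at an app until it
# crashes or the event budget runs out. Seeding the generator matters: a
# failing run can then be replayed exactly.

import random

def run_monkey(app, events=5000, seed=42):
    """Send random taps/keys/swipes; report whether and when the app crashed."""
    rng = random.Random(seed)
    for i in range(events):
        event = rng.choice(["tap", "swipe", "key", "back"])
        x, y = rng.randrange(1080), rng.randrange(1920)
        if not app(event, x, y):        # the app returns False on crash
            return {"crashed": True, "events_sent": i + 1}
    return {"crashed": False, "events_sent": events}

# A fake "app" that crashes whenever 'back' is pressed in the top-left corner —
# the kind of obscure combination a human tester rarely bothers to try.
def fragile_app(event, x, y):
    return not (event == "back" and x < 100 and y < 100)

result = run_monkey(fragile_app)
print(result["crashed"], result["events_sent"])
```

The dumbness is the point: the monkey presses the settings button (or the back button in a weird corner) thousands of times, whether or not a human thought that was worth testing.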
We're doing updates. We tested this application, but we hadn't found this bug. We hadn't got round to thinking, oh, let's remember to run it on an older device. Stupid us — but thankfully they caught it for us. On the other hand, the test run isn't perfect, because they're all the same bug: all of them crashed at exactly the same point, but the reporting's wrong. So once we saw this, we were able to go to the developers and say, hey, it seems to be something to do with Android. One went and had a quick look on the Stack Overflow site. Damn, okay. We changed a little of the way we did the layout, fixed the bug, updated it in the app store, and the next day we had a new release out there. So sadly it affected some users. We actually got some good correlation from reviews written by users: a few users had downloaded it and said, hey, it crashes on my device when I press the settings button, and that's exactly what was happening — the monkey was pressing the settings button, and we hadn't bothered doing that in our testing. The next thing — and it is coming on to your point — most of you have heard of a company called Microsoft? Yep, good. Microsoft still make mobile phones and an operating system, and it has a few users, a few million users. They own their own app store, like Google effectively owns the Android Google Play Store, and Apple with their store. They did something fairly clever: they looked at every crash in a year, 25 million or so of them, and they went through the crashes and asked, is there anything in common with these crashes? What they decided — and I'll show you more about this in a minute, on the next slide — is essentially to analyze all the crashes and look at ways they could reproduce them using automated testing. The paper is a very good paper; it's freely available if you search for "how to smash the next billion mobile app bugs", that'll get you to the right paper. And before I forget, the slides are available for you. Also... the screen isn't available.
Never mind. Something weird and wonderful happened there; my computer is semi-happy. It looks like it thinks the HDMI has gone, or something, from the HDMI projection. Okay, I'll keep talking, and hopefully we'll come back to the slide in a minute. Okay, we're rebooting the projector. What they found is that roughly 90% of the bugs came down to 10 common problems. I'm going to ask you whether you know what an HTTP response code is. Anyone know these things? Can you give me an example of an HTTP response code? 404, good. Anyone else? 500. Anything else? 200. Right. So what do app developers do when they write software? They assume that when they make a network request to get their clever JSON stuff, it's going to come back and not have an error. They assume this will happen a hundred percent of the time, but guess what? Sometimes it doesn't. Sometimes something goes wrong, and perhaps it's not 0.1% — that's one in a thousand — perhaps it's 1%, one in a hundred; perhaps it's 10% on a bad day. But every time they didn't handle the response correctly, what did the application do? Crash. Simple as that. So Microsoft used a bunch of smarts. They know their platform; they were able to query the structure of all the applications. They tested about 3,000 applications, and with these applications they automatically generated test cases that generated the crashes. The really clever thing they did is they had what they called fault-inducing modules, and the fault-inducing modules were able to do things like injecting a non-200 response code into any network message that came back. They were able to emulate an impatient user. Now, I stress, all this testing was automated. There wasn't a human writing test cases; it wasn't a human doing the testing. It was all running in a vast collection of servers, a bit like AWS — the Azure cloud and such. What it ended up doing is sending the developers the test cases that reproduced the failures, saying: just so you know, your app is in our app store. Thank you.
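A fault-inducing module of this sort can be sketched as a wrapper that replaces some fraction of successful responses with error codes, and then checks whether the calling code survives. The names and structure below are my own invention, not Microsoft's:

```python
# Sketch of a fault-inducing module: wrap a network call so that some fraction
# of responses come back as non-200, then see whether the handler copes.

import random

def with_faults(fetch, error_rate=0.3, seed=1):
    """Return a version of `fetch` that sometimes injects HTTP error codes."""
    rng = random.Random(seed)
    def faulty_fetch(url):
        if rng.random() < error_rate:
            return rng.choice([404, 500, 503]), None   # injected fault, no body
        return fetch(url)
    return faulty_fetch

def real_fetch(url):
    return 200, '{"items": []}'   # stand-in for a real network request

def fragile_handler(fetch, url):
    status, body = fetch(url)
    return body.upper()           # assumes a body always arrives: dies on faults

def robust_handler(fetch, url):
    status, body = fetch(url)
    if status != 200 or body is None:
        return ""                  # the error path is handled explicitly
    return body.upper()

faulty = with_faults(real_fetch)
try:
    for _ in range(100):
        fragile_handler(faulty, "https://example.com/api")
    print("fragile survived")
except AttributeError:
    print("fragile crashed on an injected fault")

for _ in range(100):
    robust_handler(faulty, "https://example.com/api")
print("robust survived")
```

The fragile handler is the "assume it's always a 200" pattern described above; the robust one survives the same injected faults.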
We love you, and here are the crashes, and here's how you reproduce them. So, given we've got this in the industry — and this definitely exists today; they did the work probably a year or two ago — when we're sitting down saying we're a great mobile phone tester, well, monkey testing can do this. We can automate it and do it a lot faster than you can. But perhaps there's still something that's worthwhile for the human beings to do. Such as getting the projector working. I'll wait 30 seconds before moving on to the next topic; I'd prefer to show you the slides. That was a technical... let's get a 200 response code from the projector. So I'm going to escape out of presentation mode; it'll take about 30 seconds for my glorious laptop, while it's getting confused, until we can get the projector working again, and then I'll continue with the topic. While we're doing this, by the way, my computer goes into a little loop for about a minute, so it doesn't actually let me do anything while it's fighting with the projector. All right, I'll move on, and I apologize if I end up doing this out of sequence. The next thing: how many of you have heard of exploratory testing? Anyone not heard of exploratory testing? So the concept of exploratory testing comes from work about 20 years ago by two people, James Bach and Cem Kaner. Cem Kaner is a professor in computing and in law in Florida, and James Bach is a fairly well-known software tester; if you haven't met or known about him, then do look him up and his work, and similarly Cem Kaner — which is C-E-M, by the way, Cem Kaner. They came up with this concept called exploratory testing. How many of you have heard of ISTQB?
Not many of you, okay. So ISTQB is the International Software Testing Qualifications Board, and the concept of this, which came about around 10 years ago — and there's an older UK-based scheme that predates it — is that people can take tests to demonstrate some level of competence in a subject such as software testing; there are a whole bunch of other ones. It became an international scheme rather than the UK-based one, and they have different levels of certification. So there's a perennial argument in the industry: if someone has got the certification, they're qualified, whatever that means; and some believe they don't need a qualification. Which one's better at testing? Makes sense? So there's some research, done with 20 people who'd got the certification and 22 who hadn't; it was actually done in the UK, and for those of you here this week, Isabel Evans was speaking here — she was one of the people who was actually in the study; I found that out afterwards. What they did is compare the faults that they found, and they found that the people who'd had certification had a structure that enabled them to continue testing when the others petered out, but the exploratory testers tested more broadly and found more interesting bugs. This is a fairly early piece of research, and by that I mean I wouldn't stake my life on saying that either A or B is better. However, it's very clear that a combination — having some structure — can help us, providing we don't become blinkered and limited by the structure. One of the biggest weaknesses or challenges I see with any certification scheme is that it has a tendency to make us think within the bubble and say: I know that technique, it's called boundary value analysis, I'll use boundary value analysis — rather than finding other ways to do our testing. Have we got any hope of this working? If necessary, we can flip to VGA. Okay, thank you. Oh, you need it that way. If not, I've got a VGA adapter, because it's VGA here, or the wall's VGA here.
Oh no, we need the Mini DisplayPort. That's what we're going to do: we'll put that in first and see, because otherwise I'm going to have to go and raid my bag and it'll take an extra minute. Sorry — yes, let me take questions, good idea. So the question is, when there's a crash — I'm going to restate what you said and try to rephrase it — when there's a crash, perhaps there's also a problem with asking for permissions, and this thing I showed you, is it just a concept or is it real? So: it's real. There are many, many things we need to think about when running a mobile application, such as, with the newer versions of Android and with iOS, the user being asked for permissions. How many of you work with Selenium and WebDriver, the standard sort of web testing framework? Quite a few of you. So you're familiar with the same problem of dialogue boxes popping up that you have to sort of blindly hit OK to. Well, the same can hold true for this. Now, with the mobile platforms, typically they're easier to interact with, and I'll tell you why. Was anyone here for the keynote a couple of days ago from ING? There's a chap from ING — is there one thing that you remember about the speaker? The speaker was blind. He couldn't see the screen; he can't see anything. But he still uses technology. So how does he use technology? He uses something called accessibility layers, and text-to-speech, and it speaks aloud to him, and all the rest of it. And there are millions of people who need this help in their devices; it enables them to live better lives. It's not limited to visually impaired people, either: it can be good for us to listen sometimes to our phone instead of looking at the screen, like when we're driving. We'd prefer it to tell us turn left or turn right rather than have to look at the screen, wouldn't we?
So the device can talk to us. Well, both of the main platforms have standardized an accessibility layer, and this accessibility layer allows us to interact with anything on the screen — because otherwise, how would the accessibility software work? So that enables us to always find that extra dialogue button, click on it, and then continue. So that problem can be solved. With the Microsoft work, they picked 3,000 applications, and I don't know whether they met any problems or not; they didn't report that in their papers. They've written a series of papers on this topic, and academic papers are typically about 10 pages of fairly dense text, so they may not have worried about it, or perhaps they ignored a few applications — but it's a solvable problem, I have no doubt about that. The more important problem, particularly if you have applications, is that only a tiny percentage of the errors result in crashes. There are lots of problems to do with usability or missing elements, and none of these are crashes. And this is one of the areas where human beings can actually help augment the automated testing. One of the things we used to do, back in my days at Google, is automatically test all our applications overnight and take little screenshots of them, so we'd have thousands and thousands of screenshots every day. The human being is extremely quick at comparing them. Now, a computer can compare them pixel by pixel — that's dead easy. Does every pixel match? Yep, done. Yep, done. It's the same as yesterday; probably okay. We'll assume that if it was okay yesterday, it's okay today. If we see differences, occasionally we can take a difference and say, we're pretty sure that when we see this difference something bad has gone wrong — the error message dialogue has popped up. But sometimes we're not sure whether it's right or wrong. Consider the AppDiff example.
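The pixel-by-pixel comparison described here really is dead easy; a minimal sketch, with screenshots simplified to 2D lists of pixel values (a real pipeline would load PNGs with a library such as Pillow, but the triage logic is the same):

```python
# Sketch: overnight screenshot triage. Identical screens pass automatically;
# changed ones go to a human, who is quick at judging right vs wrong.

def diff_pixels(img_a, img_b):
    """Return coordinates of every pixel that changed between two screenshots."""
    return [(x, y)
            for y, (row_a, row_b) in enumerate(zip(img_a, img_b))
            for x, (pa, pb) in enumerate(zip(row_a, row_b))
            if pa != pb]

def classify(img_today, img_yesterday, threshold=0):
    """Automate the easy decision, defer the hard one to a human."""
    changed = diff_pixels(img_today, img_yesterday)
    if len(changed) <= threshold:
        return "same-as-yesterday: assume OK"
    return f"{len(changed)} pixels changed: send to a human for review"

yesterday = [[0, 0, 0], [0, 0, 0]]
today     = [[0, 0, 0], [0, 9, 9]]   # something drew over part of the screen

print(classify(yesterday, yesterday))  # same-as-yesterday: assume OK
print(classify(today, yesterday))      # 2 pixels changed: send to a human for review
```

The machine does the exhaustive matching; the human makes the informed decision about whether the change was intended.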
AppDiff doesn't know whether we intended to change the UI or not; it just tells us it changed, and it allows a human being to make the informed decision. So that's all part of this, and typically automation only checks a subset of things. What I'm trying to challenge us to do is think about how we can use automation in ways such that we no longer need to do that testing so much — because it will find out a lot more for us; it will test broader, faster, and deeper in certain areas — and then we can refocus our testing. So I'll talk about a little Indian company here. Some of you will have heard of a company called Moolya; they're based here. One of the things Moolya have been experimenting with is a guide to help people test more effectively. I think it's still in the prototype stage; I played with it in the prototype stage. For those of you who've tried exploratory testing: one of the concepts people use for exploratory testing is little combinations of letters that help us remember types of testing. So: I want to make sure I test the platform, I want to make sure I test the data.
I want to make sure I test the functionality, and the operations, and that becomes a little acronym — one of which is known as San Francisco Depot: structure, function, data, etc., etc. There are now hundreds of these, including quite a few still being created, such as "I SLICED UP FUN" for mobile applications. So far so good. When we look at testers — and I've talked to a lot of the testers at Moolya over the years — I've tried to get them to tell me how effective these are: if you're asked to test, say, communications, which is used in one of these acronyms, do you actually find real bugs or not? Because all of us have got limitations in terms of time and so on. But very few people go back and record: when I was doing the data testing, these are the types of problems I found, and next time I test a similar application I want to make sure I test data, because it seems like there are certain classes of bugs there. So what they've done is create a prototype. And again... until the computer does something... I don't want to reboot the computer... well, let's just do the big holding-down-the-button routine, and then you press a little button and magic happens. Normally I have two laptops with me, by the way; the other one is a super-cool MacBook with USB-C, but it almost always hangs with a projector, and this one is five years old — I thought even a five-year-old laptop would cope, but perhaps not. Anyway, when it logs in I'll type in all the rest of it and we'll hopefully be back and running; it's a fairly quick, normal Mac — another couple of minutes waiting for PowerPoint to be happy. Let's just log in. Give it another minute before we plug the projector in, otherwise it'll go horribly, horribly wrong as my 4,000 browser windows try desperately to connect to the internet — and I've turned off the Wi-Fi. So when it's finished with all that, we'll plug the projector in and go to PowerPoint. So, what they've done is create a tool to help us guide the testing and
collect the results of the testing that we're doing. It's fairly early-stage work; it's still a prototype, and a lot of the ideas are not yet built into the tool. But what I'm hoping is that we'll start mining data. I mentioned Microsoft: Microsoft Research, in my view, are doing some of the best research in the industry in things like testing mobile applications, and also in software analytics. Now, software analytics is a broad topic. It says that everything we do on a computer is digital: every key we type, every time we file a bug, every time we check in code, and so on. Let's see, I'm going to do this one. Oh — good old rebooting the computer, I think, actually. Oh, this is the old version of the slides; it's not this one. So I'll go fairly quickly. This is San Francisco Depot. This whole long list is taken from some websites that are meant to help you think about testing, so I just threw these in in between. And what you can see is there are tens of different words we can think about. What happens if something's absent? What about sequencing, concurrency, where little bugs happen, etc., etc.? Remember, while you're taking pretty pictures, you can actually have the slides. So this is the testing heuristics stuff from Moolya, and the idea is that you fill out a little template of the sorts of things you want to test and what you want to focus on. It then allows you to pick some of these models, these heuristics, and in this case I particularly want to test this one called "spies". It provides little guides for people on how to do this testing — all of this is available on the internet, you can do it by yourselves; the difference is in the data collection. So in this case I'm picking this one called "spies", and why am I doing this? Because I want to test searching, and things that affect searching, like special characters. It turns out that in our Wikipedia application...
...our searching is very weak; it's got lots of limitations. One of them is that it requires you to type everything in the right case. Now, for English, most of our words are in lowercase unless we start a sentence, or unless it's someone's name like Julian, or the name of a city or a country. So we're fine with this, pretty much — not perfect, but we can kind of cope. In languages like German, where they capitalize every noun, it becomes a big deal. And it turns out we were getting lots and lots of grumpy users who spoke German, because town names are normally compound words with capitals in them. The weird thing was that for the first character we search case-insensitively. What this does is fool the user into thinking they don't need to worry about case, because when I type in the first letter it kind of works: it starts giving me search results, and I can see what I'm looking for in the results — and then, as I type more characters in, the results disappear. So, cutting a long story short, for the testing, that's the sort of thing I wanted to focus on. This is an example of how it helps guide me through the testing and makes sure I cover the different areas. And then we've got some sort of reporting showing the progress, and this is where we start to learn from the data. So I want to talk a little bit about analytics. My PhD is in using analytics to capture information about how mobile apps are being used, because I believe it can help improve testing and development practices. The nice thing is we're no longer limited by our own thoughts and our understanding of the users; we're actually getting real data from roughly the whole population. So if we've got two million users, we're getting data from roughly two million of them — apart from those who are in flight mode, or who haven't got a Wi-Fi connection, or are blocked by proxies.
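The search bug described here — only the first character matched case-insensitively, so one letter "works" and the results vanish as you keep typing — can be reproduced in a small sketch. The titles and behavior below are my own illustration of the described symptom, not Kiwix's actual code:

```python
# Sketch: a prefix search that treats only the FIRST character
# case-insensitively, and the fix that matches the whole query that way.

titles = ["Bad Homburg", "Berlin", "Julian"]

def buggy_prefix_search(query, titles):
    """First character case-insensitive, the rest case-sensitive (the bug)."""
    return [t for t in titles
            if t[:1].lower() == query[:1].lower()
            and t[1:].startswith(query[1:])]

def fixed_prefix_search(query, titles):
    """Whole query matched case-insensitively (the fix)."""
    return [t for t in titles if t.lower().startswith(query.lower())]

print(buggy_prefix_search("b", titles))       # ['Bad Homburg', 'Berlin'] — looks fine
print(buggy_prefix_search("bad h", titles))   # [] — results vanish at the capital H
print(fixed_prefix_search("bad h", titles))   # ['Bad Homburg']
```

The first query rewards the user for typing lowercase; the longer query punishes them the moment the title's internal capital comes into play — exactly the fooled-then-disappointed experience described above.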
Maybe we don't get their data; maybe it gets sent to us when they're next on the network, depending on the libraries you're using. But roughly, we're getting a hundred percent of the population. It's virtually free to collect the data, and hopefully it can help improve what we're doing. So I'm looking at how we could use this, for instance, to help improve the testing we're doing, to give us guidance. I've mentioned this Microsoft model, so it's probably worth spending an extra minute or two on. When we look at the work we do, given that we've got all this data — like all our bugs in the bug repository — we tend to spend our life here: we run reports and we say, how did it go last release, last sprint, whatever it is? Oh well, we've got one severity-one, it wasn't too bad; we've got three severity-twos and a bunch of severity-threes that we'll never look at again in our lives. But we don't necessarily look at the factors that caused this. What was the reason we had a severity-one? Was it because we didn't think to test on the older Android device and test the settings? Was it that our people didn't have the skills, the knowledge, the understanding? Was it a gap in the developer's understanding of how they wrote the application? Once we start to look at this, we can look for common factors and hopefully address them. Again, I think I mentioned this one briefly: when you release software, if you've got a production system running in the cloud, etc., and we've made some changes, we need to look at the real-time alerts that are happening right now, this second. One of you came to my workshop on Monday and ran out again because something was going wrong right then — but guess what, he had to go and solve the right-now problem. So we know that something's wrong.
We then need to decide what to do next. Now, with a mobile app, typically we have to decide: can we disable the new feature if necessary? There are some toolkits that let you do that. Do we stop new users downloading the application? Do we essentially pull it from the app store and just let them download the old one, not the new one? Do we wake up our developers, who sadly were awake till two in the morning making the release happen, and say, hey, can you come back into the office, because we've got this crisis going on? Do we staff customer service with more people who can answer the call and say, yes, customer, we know it's wrong, we're really sorry, we'll make sure we fix it? Do we have people go into the app store, and when people are writing the rude one-star feedback, write the response saying: actually, we do know about the problem, thank you very much for telling us, we're working on it now, and please come back later today and see if we've got the update; we think we'll have it ready for you in a few hours, subject to the app store?

This becomes really important, because anyone who uses mobile apps is familiar with the concept of ratings. How many of us will install an app with two stars? Why would we ever do that? We're taking the wisdom of the crowd to say this app is crap. It turns out a difference of 0.1 in the ratings can materially affect your app by about 10% in terms of downloads. That affects the search order, which affects the in-app purchases; I haven't got time to go into the details, but it really matters. So when you're responding to your users, what happens is: you were the person who wrote the first one-star "your app stinks" review. We responded to you and said, thank you very much for your feedback, whatever your name is, Jack, Fred, we're going to deal with it in the afternoon. And you've just found the same problem. You're coming onto the app store to give us one star.
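Back to the first of those mitigation options for a moment, disabling a new feature remotely: this is usually a server-driven flag that the app checks. A minimal sketch, with a hypothetical flag name and fetch function (this is not any specific toolkit's API):

```python
# Sketch of a remote "kill switch" (all names hypothetical): safe defaults
# are baked into the app; the server can override them, and if the network
# call fails the app stays on the safe defaults.
DEFAULT_FLAGS = {"new_checkout_flow": False}

def effective_flags(fetch_remote):
    """fetch_remote() returns a dict of flag overrides, or raises."""
    flags = dict(DEFAULT_FLAGS)
    try:
        flags.update(fetch_remote())   # an HTTPS call in a real app
    except Exception:
        pass                           # offline or server down: keep defaults
    return flags

print(effective_flags(lambda: {"new_checkout_flow": True}))   # server enables it
print(effective_flags(lambda: {}))                            # server silent: default
```

The point of the design is that "disable the new feature" becomes flipping one value server-side, with no app-store release, and a failed fetch degrades to the last known-safe behaviour.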
You want to make sure it hurts. Now, if you see the response to his review, you say, oh, they're dealing with it, I don't need to file a report any more, I'll just come back in the afternoon. If, on the other hand, you saw his complaint and there's nothing there, we haven't bothered responding, you think: they're lazy, good-for-nothing developers, I'm going to make sure they pay attention. I'm going to write one star, and you're going to write one star, and you're going to write one star, and we're all going to write one star, because we want to make sure they wake up and fix the damn bug now, now, now, a bit like I wanted the projector fixed now, now, now. So this is really important for us to deal with; sometimes the recommendation is the important side of it, not just dealing with the alerts.

The future: we all like predicting the future. The weather is going to be warm and sunny tomorrow, probably, since we're in Bangalore. We're going to forecast the growth curves of our application: 10% more users, 20%, 30% more sales. And we kind of expect this to be linear. By that I mean, if we have more users, perhaps we need more servers, and it seems like if you have 10% more users, maybe 10% more server capacity is about right, doesn't it?
But sometimes we cross a boundary where the behaviour of the system becomes different. Most of us have water that comes out of taps. As we start turning on a tap we get little drops of water, and as we turn it we get bigger drops. At some point we turn it just a bit more and it changes from drops to a stream, but it's a fairly smooth stream coming out of the tap. We keep turning and we get more and more water; it stays a stream, and eventually it becomes turbulent, spitting out and going off to weird places, because the water dynamics change at two different transition points. If we heat water from ice, through liquid, to steam, we see transition points. The same sort of thing happens in computing. While we're running within a resource, let's say available RAM, the behaviour is pretty much the same; it may vary by 20%, but in the grand scheme of things 20% isn't a lot. When it runs out of available memory, it has to put some of this on storage, and that takes microseconds, milliseconds, even seconds to do. The behaviour changes by an order of magnitude, two orders, three orders; that's ten times, a hundred times, a thousand times, to make this clear. A bit like my computer going stupid earlier: it went really stupid, and it wasn't going to recover by itself. What we want to do is build prediction models based on this; we're looking for the insights.

I have about five minutes left, so I'm going to speed up a little bit. I'm looking for two types of things.
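That step change at the memory boundary can be modelled as a toy step function; the numbers below are illustrative, not measurements. Within RAM, an access costs a fraction of a microsecond; once the working set spills to storage, the cost jumps by roughly three orders of magnitude, which is why a linear capacity forecast breaks at the boundary:

```python
def access_time_us(working_set_mb, ram_mb=1024):
    """Toy model of the RAM threshold: below the limit, cost is flat;
    above it, pages spill to storage and cost jumps ~1000x."""
    if working_set_mb <= ram_mb:
        return 0.1        # in-RAM access, fraction of a microsecond
    return 100.0          # paging to storage: three orders of magnitude worse

# A 20% increase inside the boundary changes nothing much...
print(access_time_us(800), access_time_us(960))
# ...but crossing the boundary is a cliff, not a slope.
print(access_time_us(1100) / access_time_us(1000))
```

This is the kind of threshold a purely linear growth forecast will never predict; you have to know where the boundary is.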
I'm looking for testing based on information. So if I know that my app is being used in Hindi or Gujarati or Tamil or whatever, perhaps this matters to me; perhaps I can change my testing and make sure I include the more popular languages. If I notice there's a Xiaomi phone that I don't have in the UK, perhaps it's worth me doing some testing on a Xiaomi, either by renting one remotely over the internet and testing by the hour on the big server farms, or perhaps I buy one and take it home with me. Things like setting the locales, looking at the user flows as people navigate through the application. I can't predict how you're going to use my application; I can guess, but once I know how you use it, I can make sure I reflect that in my testing.

We've talked about the crashing, and what am I trying to do? I'm trying to get to fast reproduction. If I can reproduce a problem quickly in my own environment, the moment I do that I have a chance of being able to fix the problem, rather than just guessing. That's what we saw with the pre-launch report and our application: the moment we saw the problem, we kind of knew what to do to fix it.

The insights are the stuff that's non-obvious. Things like the rate of change: we suddenly see that usage is going down on these devices. Did we expect that to happen? Maybe our app no longer installs properly on these devices. Does it matter to us? We're not getting data back; sometimes the absence of data is as important as the availability of it. I've mentioned the thresholds. So what we're trying to do now is maximize our learning. This is part of my research. Sorry.
It's a bit faded here, but essentially what I'm looking at doing is taking the analytics events, things like an update of Android, for instance, or a new device being launched, together with a bunch of tests that we already know and love (or hate), and being able to recommend things like: run these tests first, don't run those ones, and actually generate new tests. I'm doing some work in that area at the moment, so if you're interested, by all means get in touch.

This is to remind you, and again, sorry, it's a little bit faded on the screen. This book is called The Glass Cage, which should be written there but it's too faded out, and the question is: who needs humans anyway? Most of you have been on an aeroplane. The aeroplane has automatic pilots. Automatic pilots have been around for about 40 or 50 years; they've been commonplace in things like the Airbuses for the last 20, and ever since fly-by-wire there's been no physical connection between the pilot and the controls and the engines and the ailerons and all those other magic things. What they've looked at is what happens as pilots use the automatic pilot more and more. I flew here for nine hours from the UK; the pilot may be in control of the plane for less than three minutes of those nine hours. They may do the takeoff, and they may do the landing, depending on the weather conditions; the rest of it, the automatic pilot flies. So far so good: the pilot's refreshed. Doesn't it sound wonderful? Hopefully the pilot's not asleep; that's why they have two of them in there, one to wake the other one up. But the big problem comes when the computer doesn't know what to do next. The fairly famous flight is a flight from Brazil to Paris, an Airbus A330; it happened about five years ago, in stormy conditions, flying at night. I'm going to fly back tomorrow morning before dawn, probably; it's fairly typical. I flew here at night.
So they were flying at night, and the weather conditions were not too good; in fact it was really crappy. And there's something called a pitot tube, which measures the airspeed. You know when you look at the little display and it says you're flying at 900 kilometres an hour or whatever, blah blah blah, it's all wonderful, and we see a little moving map. In the big aeroplanes it's essentially used also to help control the engines and everything else, and something went wrong with it. The computer essentially said: I don't know what speed we're flying at, I can't fly any more. Over to you, pilots. Yippee. How does it feel to be a pilot at 33,000 feet, in a storm, knowing you don't know what the airspeed is? What did the pilot do? Pull the stick back. What happens when you pull the stick back? The plane goes up. Trouble is, the plane slows down. The plane went into a stall, went into the ocean, everyone dead. It took years to find the black box; about 300 people died. They did a bunch of research afterwards and took pilots into the, I forget what you call it, where they practise in a sort of fake aeroplane; a simulator, that's the word. They took them into the simulator, and they found that after six weeks of not flying an aeroplane, a pilot has lost a lot of their competence to control it. The pilots who had flown the old aeroplanes, like the Boeing 747s, which had physical cables, had a long-term memory that helped them still fly even if they hadn't flown for a long time; but newer pilots don't learn that.

This book then looks at doctors. The same happens for doctors. In the UK, you have seven minutes to greet the patient and diagnose a single fault, a single complaint. You've got two complaints? I've got a cold and my leg's broken. We can do the cold or the leg, not both; two appointments, go away. And in seven minutes they look at their computer, and they come up with: ah, penicillin, okay, you're going to get 16 pills, pay so you can go.
Oh, and by the way, thank you very much for visiting. So they found, again with medicine, that doctors are now relying on computers more and more to make decisions, and there's less and less clinical knowledge. Now, in some cases the computers are great; they can diagnose down to the genome level. But we're losing skills, and it's happening in our industry too. I'm working at a bank at the moment, and they order programmers by the tens: we need 40 of them. Skills? Competence? Good luck. So remember this, and remember that as human beings we have a part to play; that's part of the challenge.

Now I'm kind of out of time, but no one's stopping me yet, so let me give you this example. What we've got here is a recording from an academic paper. They're saying, essentially, each time there's a key press (these little red-ringed things) we have a little blue thing: a network message. Is this a bug? We've got all the data we need here; we've recorded every key press, we've recorded everything that's happening on the device, tons of traces. Is it a bug? Come on, you're my computer, you're the algorithm: is this a bug? Give me a guess. I can't continue without you. Okay, thank you, someone's bold enough to make a decision. It's a bug. What's the bug? We can see the correlation, but we don't know what the causation is, and the causation is really important. Most of us use a search engine where, as we start typing, it starts giving us results that vary as we type. So if this is sort of a Google search on a mobile device, perhaps it's exactly the right behaviour: every time we type a character, it sends a little packet to the server, and the server sends back new responses.
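One common design decision behind that per-keystroke traffic is whether to debounce: coalesce a burst of key presses into a single request once the user pauses, instead of one message per key press. A toy sketch, with illustrative timings and names (not from the paper being discussed):

```python
# Toy debouncer: key presses update the pending query; a request is only
# "sent" once no new key press has arrived for quiet_ms milliseconds.
class Debouncer:
    def __init__(self, quiet_ms=300):
        self.quiet_ms = quiet_ms
        self.last_key_ms = None
        self.pending = None
        self.sent = []            # stands in for real network messages

    def key_press(self, text, now_ms):
        self.last_key_ms = now_ms
        self.pending = text

    def tick(self, now_ms):
        if self.pending is not None and now_ms - self.last_key_ms >= self.quiet_ms:
            self.sent.append(self.pending)   # one message for the whole burst
            self.pending = None

d = Debouncer()
for i, text in enumerate(["b", "ba", "ban"]):   # fast typing, 100 ms apart
    d.key_press(text, now_ms=i * 100)
    d.tick(now_ms=i * 100)                      # nothing sent mid-burst
d.tick(now_ms=600)                              # user paused: one send
print(d.sent)                                   # ['ban']
```

A search box that legitimately wants suggestions on every keystroke would simply not debounce; the point is that the network trace alone can't tell you which design was intended.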
This is what we want. If, however, this is an application that's sending every key press because it doesn't know what to do with it, perhaps this is a real bug in the application: it shouldn't be sending network messages until the end of something, like when we've completed the order form. The important thing is, we don't know which it is, and human beings are very good at discerning which it is. We can look at the application; we can make some educated guesses.

Again, I'll be very brief, in terms of timing. This is about smart cities and smart homes; I happened to be at a talk on it. This bug is a bug in the Linux kernel called Dirty COW. Most people have heard of Linus Torvalds; he's the inventor of Linux. This bug was one he tried to fix nine years ago and failed, and it came back last autumn. And I was thinking this through: if Linus Torvalds had problems fixing this bug, and they did eventually fix it, by the way, perhaps there are ten people in the world who know how to deal with it. Is this our new world order: that all our testers need to be really brilliant people who are better than the machines, and better than the programmers as well? In which case, good luck to the rest of us; maybe there's an elite, but not even I count at this level.

There's also this: a newspaper from the 22nd of October in the UK. If you look around here, there are little cameras in many of the rooms; certainly all the elevators have cameras in them, the foyers of the hotels have cameras, the offices are full of cameras. People buy cameras for home so they can watch their little child: when you're sitting bored at work, you can log in remotely to your IP camera and watch your child crawling on the floor.
Well done, remote parenting. But perhaps you're also on holiday and you want to see whether someone's broken into your apartment, or whatever. So far so good. Over half the cameras are made by a single Chinese manufacturer, and the trouble is, they're not hard to break into remotely. This is a list of passwords; there's a website with the default passwords of all the web cameras they know about, from A through Z, including people like Bosch, who you'd think would know better, and Samsung. Hang on a sec, let me guess how long it's going to take to work out a password of "admin" and "111111"; I can even use Google to search for it. So what happened is, millions of these were remotely controlled and told to all attack PayPal, to try and take the site down. This is the stuff I would like us to focus on: how can we design devices so that we don't have the stupidity of people installing them with the default username and default password? It can be done; maybe we should do so.

I think I'm out of time, but you'll have the slides, and you can read up about learning from medicine: the US is trying to reduce the cost of wasteful prescriptions of tests and medicines by 250 billion dollars. The final thing I'll end with is that I like cycling. The latest lights now have sensors in them that can detect potholes in the road. Now, the only country I've pretty much not cycled in is India, because your potholes would beat my bicycle. But these lights can actually send back the data through your smartphone and give you a map of your city with the potholes.

So with that I'd probably better end and say: this is me, Julian. Thank you to these nice people. I'll probably have to take questions outside so the next speaker can come in, and I'll be around for the rest of the day. Thank you.
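As a footnote to the camera story above: designing away the default-password problem is mostly a product decision, not a hard technical one. A minimal sketch of the idea, with a made-up blocklist, is a setup step that refuses to finish while factory credentials are still in place:

```python
# Illustrative sketch: device setup may not complete while factory-default
# credentials remain. The blocklist below is made up, not a real vendor list.
FACTORY_CREDENTIALS = {
    ("admin", "admin"),
    ("admin", "111111"),
    ("admin", "1234"),
    ("root", "root"),
}

def setup_may_complete(username, password):
    """Allow setup to finish only with non-default, non-trivial credentials."""
    if (username, password) in FACTORY_CREDENTIALS:
        return False          # still the factory default: refuse
    if len(password) < 8:
        return False          # trivially short: refuse
    return True

print(setup_may_complete("admin", "111111"))                 # still the default
print(setup_may_complete("admin", "correct-horse-battery"))  # acceptable
```

Had the cameras in the story shipped with a gate like this, the botnet that was pointed at PayPal would have had far fewer devices to recruit.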