Thank you, and thank you everyone for coming. My name is Adam Dangoor, and I work at a company called Mesosphere, building an operating system for data centres. But last year I was working on something quite different: the back end of an iPhone app. As a user, you would take a photo of a wine label with your phone, and the app would tell you all kinds of details about that wine. Or at least it was something like that; I'm going to protect my NDA here today. And our app was a Flask app. If you don't know Flask, it's a really simple web framework, and an app looks something like the first sketch below. A really cool thing about Flask is that it provides, give me a second, a Werkzeug test client (I hope I got that right). What that means is that you can make requests against an in-memory application and get response objects which you can inspect. So a test using it looks like it makes an HTTP request, but actually everything is done in memory.

Now, in our wine recognition app we also used something called Vuforia Web Services. Basically, Vuforia Web Services is a tool that lets you upload a whole bunch of images, in our case images of wine labels. Then, when a user uploaded a photo to us, we could send that image to Vuforia, and Vuforia would tell us which of our previously uploaded images their photo most closely matched. We could then fetch details about that wine from our database and tell the user things like how much it should cost, how well rated it is, and exactly where it's from. But when we built our prototype, we kept finding loads of problems, loads of bugs. In particular, those bugs came from assumptions we'd made about Vuforia which weren't quite right. Often that came from reading their documentation and trusting that it was truthful and complete, but you can't always make those assumptions. So we wanted to add tests for our matching workflow, which of course used Vuforia, and we wanted those tests to be in our existing test suite. Now, Vuforia here was accessed over HTTP, and that's what I'm going to focus on today, but the general idea really isn't specific to HTTP. You might want to test code that uses a database for local storage, or a deployment workflow which uses Docker, or code which uses Amazon S3 or some other cloud service as a storage backend.

Now, we were lucky: we had a very clear idea of what we wanted our first test to be. I know this is quite a lot of code to have on a slide, but simply, we wanted to test that if a user uploaded a photo of a wine label which matched a photo we had already added, they would get details about that wine. So I wrote a test that looked a little bit like the second sketch below. I had two wines here; add_wine, let's say, adds a wine to our database, but it also uploads it to Vuforia, and then I check that I get the right one back when I query the match function. That match function uses Vuforia on the back end.
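(A minimal sketch of what those two slides might have shown. The route, endpoint, and helper names here, wine_details, add_wine, match, and the label images, are all illustrative, not the real app's code.)

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/wines/<wine_id>")
def wine_details(wine_id):
    # An illustrative endpoint; the real app's routes were not shown.
    return jsonify({"id": wine_id})

def test_wine_details():
    # Werkzeug's test client: this looks like an HTTP request, but
    # everything happens in memory against the `app` object above.
    client = app.test_client()
    response = client.get("/wines/123")
    assert response.status_code == 200

def test_match():
    # Roughly the first matching test: add_wine stores a wine in our
    # database *and* uploads its label image to Vuforia; match asks
    # Vuforia for the closest match.  Both helpers are hypothetical.
    malbec = add_wine(name="Malbec", image=malbec_label)
    add_wine(name="Merlot", image=merlot_label)
    assert match(image=malbec_label) == malbec
```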
Now, with some third-party tools, maybe even some of the ones I mentioned, like Docker, you might be totally fine calling the real tool in your test suite. But when we called Vuforia in our tests, we hit some problems. First of all, we were at the mercy of the network. When our CI system had a little network glitch, our whole test suite would fail, because our tests made HTTP requests over the internet, and we didn't know whether those failures were because of the network or because of some flakiness in our code. We were also at the mercy of Vuforia itself: when Vuforia went down, even temporarily, our test suite would fail, and it really does slow down development if you're constantly wondering whether you've made a mistake or whether the problem is on their end.

Now, say you're using a real service like S3. S3 might be pretty stable, probably even more stable than your software, so you might not have to worry too much about flakiness. But S3 charges you per megabyte, so if you use it in your test suite, it might actually become really expensive to run your tests; you could end up spending quite a lot of money. Another problem you might run into is resource limits. This is definitely something I've hit. A lot of services have resource limits, a certain number of requests that your account can make, and so if you call something in your test suite very heavily, especially if you're doing performance benchmarking and making a load of calls, you might hit those limits, and then you can't run your tests any more and you're pretty much blocked on development. And even when those things weren't problems, everything was really slow. Vuforia is quite advanced software: it does a lot of processing magic so that it can do the image matching, and that means that after you've uploaded an image, it takes a few minutes until that image can be matched. That's totally reasonable; I don't think I could really expect them to do it instantly. But in our test suite, I didn't really want to wait a few minutes to know whether our get-match code worked.

So we called tests like the one I showed you before integration tests, because, well, they tested the integration of our software with Vuforia. I think a lot of people get confused by the terminology; some people call these things acceptance tests or end-to-end tests, but I think we can agree they're high-level tests. And they were definitely useful: they really did help us track down some bugs. But we also wanted unit tests, because unit tests give us a lot of benefits over integration tests. In particular, they tell us whether our code calls Vuforia correctly, in this case, even when Vuforia is down. Unit tests are also really small in scope, which means that when one fails, not all the time, but often, you know exactly which part of your code failed, and changing that bit of code to make the unit test pass can be a small, isolated change. And when you've got unit tests that run quickly and are small, you can even use tools like Hypothesis to generate a whole bunch of them. So what we wanted to do was turn a code base which could currently be tested only with integration tests into one which could also be tested with unit tests. One way that some people achieve this is by using mocks. Roughly, a mock is some code which provides the same interface as something that your code calls, but it reduces or removes some cost. In this case, the main costs that we cared about, like I mentioned, were time, those slow tests, and flakiness.
But again, you might want to avoid financial costs, resource limits, or all kinds of other costs that can come into your test suite. So my goal was that wherever the code under test made a request to Vuforia, at least in our unit test suite, that request would actually be handled by a mock function rather than going over the web. Now, we were very fortunate: we were using the requests library, which I'm sure at least some of you are familiar with, and there are a few ways in Python to get requests made with the requests library to point at some mock code. The tool I chose is called requests-mock. I know there's also another one by the folks who make Sentry, called responses, and there's also something called HTTPretty if, say, you're on Python 2 or you're not using the requests library.

The simple requests-mock example is this: you say, when I make a GET request to test.com, return the string "data". And that's pretty simple. At the same time as using requests-mock (sorry to the person who tried to take a photo, the slides will be online), we were also using pytest. pytest is a test runner which gives you a really neat way to do set-up and tear-down for test requirements. That feature is called fixtures, and we had a fixture which says: hey, if I use this fixture, then requests in this test will be handled by mock code. You can see we yield while we're inside the context manager. But I'm sure that if you're using a more traditional test framework, you can just use the normal kind of set-up and tear-down methods. Now, I didn't want to return the string "data" or something like that; I wanted some quite advanced features in my mock. In particular, I wanted a stateful mock, which would allow me to give different responses based on previous requests, so I could give a different match response if someone had already uploaded a picture of a matching label to the mock. So I used a requests-mock feature which let me use a callable instead of a predefined response, and that callable takes a request-like object that gives me almost all the details of the request. With that, we created a whole bunch of small mock functions, one for every endpoint we used, something like the sketch below.
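(A minimal sketch of those pieces, assuming hypothetical endpoint URLs and response bodies; the real Vuforia endpoints aren't shown in the talk.)

```python
import pytest
import requests
import requests_mock

# The simple case: a canned response for one URL.
with requests_mock.Mocker() as m:
    m.get("http://test.com", text="data")
    assert requests.get("http://test.com").text == "data"

# A stateful mock: responders are callables taking (request, context),
# so responses can depend on what was uploaded earlier.
class FakeVWS:
    def __init__(self):
        self.targets = []

    def add_target(self, request, context):
        self.targets.append(request.json())
        context.status_code = 201
        return '{"result_code": "TargetCreated"}'  # illustrative body

    def query(self, request, context):
        context.status_code = 200
        # A match is only possible if something was uploaded first.
        return '{"matches": %d}' % len(self.targets)

# A pytest fixture: any test that asks for `vws_mock` gets the mock.
@pytest.fixture
def vws_mock():
    fake = FakeVWS()
    with requests_mock.Mocker() as m:
        m.post("https://vws.example.com/targets", text=fake.add_target)
        m.post("https://vws.example.com/query", text=fake.query)
        yield fake
```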
At that point we'd pretty much achieved our goal, right? We could test our code without touching the real Vuforia. But then we hit some more problems, problems when we were using that mock, and I actually think these are problems that a lot of mocks face. Sometimes we found that we hadn't copied the interface correctly. It can be pretty hard: there are lots of edge cases (what if the image is too big? do we give the right error back? that kind of thing), and humans make mistakes, even with code review. So we found that we'd copied a lot of things incorrectly. But even when we were extra careful, we found that the mock quickly became outdated whenever Vuforia changed. If they had sent out a really nice changelog, we could have changed our mock to match it, but that's not always the case, especially for very minor things. And this isn't, you know, a Python library where you can inspect the code changes; this is a web service. Now, when you have an outdated mock, you have quite a serious problem, or at least it was serious for us: your tests pass, but your software is actually failing in production. And when you've got that, you can have a really difficult time tracking down exactly why your code is broken, because everything looks like it should be working, and you have to discover that, actually, your mock is wrong, and then find where it's wrong, basically remaking those requests manually to check your mock. It's very tedious.

So, that was a contract gig, and the contract ended. I kind of felt like I'd built an OK solution; it was working all right for the client. But I really felt the problem could be tackled in a better way, and that I could have provided a better solution if I'd had more time, in particular because we kept hitting those issues of Vuforia changing and of human error. At the same time, I really believed, and I still do, that Vuforia could be a genuinely useful tool for a bunch of people, and it could be especially useful if it were easy to develop against. So I set out to make VWS-Python, which is basically an open source library for using Vuforia Web Services with Python. It's in progress, hopefully coming very soon to PyPI. But I also had another goal. I started testing the library with an open source mock, part of that library, but I realised that the mock itself is very useful whether or not you're using my library, and I wanted to ship that mock to people so that if they were writing code which used Vuforia, they could have the mock for their own tests.

So I wrote some integration tests for the library, and some unit tests for the library which used the mock, and I put the test suite on Travis CI, because I knew it and because it was free for open source projects. One really cool feature of Travis CI, and I'm sure a lot of other CI systems share it, is that I can give it the credentials for Vuforia without having those credentials show up in the code base, where someone could abuse them, and without having them show up in the logs. So I could really use the real service even from a CI system, and every time I made a change to the library, the tests would run, and those integration tests would run against the real Vuforia.

But if you remember the goal I set, I wanted people to be able to use my mock to test their code whether or not they were using my library. And there's a cool way to let even people who use different programming languages, not just Python, use your mock, while still keeping the interface really nice and pleasant. If you remember, we had a pytest fixture, or, if you're not using pytest, just a context manager or decorator; you want to keep that for Python users, but let other people use your code as well. The way I did this was to build the mock so that it could be run as a standalone server, which meant ditching the requests-mock syntax we had before. So I wrote a little bit of code which I'm not going to get into too deeply, because maybe I'm a little embarrassed; it's a bit of a hairy hack. But really it let me rewrite the mock as a Flask app and keep using it with requests-mock. That means I've got a Flask app I can run as a standalone server, but if I use this code, it ties into requests-mock: it translates the request objects from requests-mock into something that can be used by the Werkzeug test client, and then translates the responses from that test client back into something requests-mock can use. All of this code will be online later, but the rough idea is sketched below.
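(A sketch of the idea behind that hack, assuming the mock has been rewritten as a hypothetical Flask `app`; the real adapter code is more involved.)

```python
import requests_mock
from my_fake_vws import app  # hypothetical: the mock, written as a Flask app

def forward_to_flask(request, context):
    # Replay the intercepted request against the in-memory Flask app
    # via its Werkzeug test client, then copy the response back into
    # the shape requests-mock expects.
    client = app.test_client()
    response = client.open(
        path=request.path_url,
        method=request.method,
        headers=dict(request.headers),
        data=request.body,
    )
    context.status_code = response.status_code
    context.headers.update(response.headers)
    return response.get_data(as_text=True)

# Route every request made inside the Mocker to the Flask app.
with requests_mock.Mocker() as mocker:
    mocker.register_uri(requests_mock.ANY, requests_mock.ANY,
                        text=forward_to_flask)
    ...  # code under test makes its HTTP requests here
```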
So if you're not using Python, what you can do is spin up the Flask app, let's say in a Docker container, for every test, and then route your requests to that container using whatever requests-mock alternative your language has. That can be particularly useful even if you're on an old Python version that my mock's code doesn't support. So I'd say this: if you're in an organisation and you're writing a mock, and you want that mock to be used across your organisation even though people use different languages, this is a really cool way to do it.

So, back to writing the mock. This time around, the mock was definitely part of my product, so I didn't want to just write it in an ad hoc manner; I wanted to write tests that confirmed it was doing what I wanted. If you think about it, at this point I'm probably duplicating a lot of the work that the people at Vuforia did: I'm rewriting a bit of their service, and I'm also thinking about edge cases for it, and what I'm doing is very manual. I make requests to their servers with the edge cases I think of, I note the responses down in tests, and then I make sure those tests pass for my mock. I especially test things that aren't mentioned in the documentation. One example: they take a width for the image, in centimetres. What happens if you give it a negative width? Well, I tried it, and I found that they gave an error. I copied that exact error into my mock, and then the library, which is the main product, handles that error and raises a nice Python exception for it.

So at this point I have three sets of tests. I have a few integration tests which use the library with the real Vuforia. I have a whole bunch of unit tests for the library, maybe hundreds or thousands if you count the ones generated by Hypothesis, and those use the mock. And then I have some unit tests for the mock itself. But I'm still vulnerable to those problems I mentioned earlier: copying incorrectly, and Vuforia changing, which will render my mock inaccurate and my library possibly even broken. Turning a mock into a verified fake, which is the title of this talk, is all about avoiding those problems. Roughly, a verified fake is a fake implementation which is verified against a subset of the same test suite as the real implementation. Now, I don't have the Vuforia code, and I definitely don't have their test suite, if they've even got one. So if I wanted to make a verified fake, which I did, I needed my own test suite: turning the mock into a verified fake really meant making a test suite which ran both against the mock and against the real thing. If you recall that simple pytest fixture from before, well, I expanded it. pytest has this really cool feature called parametrisation, and you can parametrise fixtures so that tests which use those fixtures are run once with each parameter option. Here I've got a simple True/False, and I map that to "use the real Vuforia" or not, so any test which uses this fixture is run twice: once with the real Vuforia and once with the mock. The fixture looks roughly like the sketch below.
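(A sketch of that parametrised fixture, reusing the hypothetical forwarding responder from the earlier sketch.)

```python
import pytest
import requests_mock

@pytest.fixture(params=[True, False], ids=["real", "mock"])
def verify_vuforia(request):
    # Tests using this fixture run twice: once against the real
    # service, once against the mock.
    if request.param:
        yield  # no mocking: requests go to the real Vuforia
    else:
        with requests_mock.Mocker() as mocker:
            mocker.register_uri(requests_mock.ANY, requests_mock.ANY,
                                text=forward_to_flask)  # see sketch above
            yield
```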
So these are the test results; they look something like this. You can see each test runs twice. Fortunately, I already had at least the start of a test suite for the mock, so the first thing I did was apply this fixture to those tests, so they ran against both the mock and the real thing, and of course I found that I'd made a whole bunch of mistakes. So now we've got a verified fake, and we have a test suite which runs against both the fake implementation and the real implementation. Because the mock has been turned into a verified fake, we actually trust that it's representative of the real Vuforia, so we have loads of confidence in those hundreds of tests we had for the library: we know they don't just rely on an unrealistic mock. But we also had another problem, if you remember. We were worried that Vuforia would change and make our mock inaccurate. Well, now, whenever these tests pass, I know the mock is still a faithful representation of Vuforia, and we only incur the cost of running a hundred or so tests against Vuforia while getting almost the whole benefit of running thousands of tests against it. So we lessen that cost of flakiness and slow tests. But at this point, our tests only run when we make a change to the code, and that happens less often, especially once a project is quite mature, so we want to know what happens if Vuforia changes in the meantime. Well, a cool feature of Travis CI, and I'm sure of a lot of other build systems, is that you can actually set tests to run on a schedule. There's a trade-off here: if you run them all the time, you find out about problems quickly, but you hit those costs; if you run them very rarely, it takes a long time to find out about problems. The trade-off I chose was to trigger them every night, but you could do every release, every week, whatever works for your particular situation.
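(As a concrete illustration of what such a scheduled run can catch, here is roughly the shape of the test involved in the story that follows; add_target and label_image are hypothetical helpers, not the library's actual API.)

```python
def test_zero_width(verify_vuforia):
    # Runs against both the mock and the real Vuforia via the
    # parametrised fixture above.  add_target is a hypothetical helper
    # wrapping the "add target" endpoint.
    response = add_target(name="wine-label", width=0, image=label_image)
    # At the time, a width of zero was accepted: no error, image added.
    assert response.status_code == 201
```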
Now, back to that width example. In the wine application I talked about at the beginning, we really didn't care about the physical width of a wine label; it wasn't a differentiating factor, and it was actually really hard to get, which is partly why we didn't care about it. So we told Vuforia, every time, that the width was zero. It didn't matter to us, and that always worked. Our mock supported it, and later the verified fake also supported it, with a test that a width of zero is OK: no error is returned, the image is added. But one morning I get an email from Travis CI, and it looks something like this: it tells me the build failed. So I look at the logs, and I see that we actually have a very precise data point of exactly what's changed in Vuforia. The mock passes for this test, but the real implementation fails, and the test is, well, "what if I add an image with a width of zero?". So what do I do? I just change the mock function and the test so that the new behaviour is represented by the mock, and that's very easy. But now, if you remember, the library's tests themselves depended on the mock. The library expects that a width of zero is valid, but it's now invalid, so as soon as I changed the mock, the library's tests immediately started failing, and I could change the library to raise a nice Python exception when a user gives a width of zero. What that really demonstrates is that, within a few hours, Vuforia made an undocumented change which introduced an incompatibility with my library, and that incompatibility was fixed without any real complex debugging. To me, that shows the value of having a verified fake to any developer who's writing code which integrates with third-party software.

Now, you can imagine that building a verified fake when you have the original source code is much simpler than when you don't, because a lot of the fake can share code with the real implementation, and hardly any web services are open source. So this can be really valuable: if you're shipping software to people which they might want to call in tests, you can add tremendous value to that software by shipping your own verified fake, and it might even cause someone like me to choose your software over a competitor's. And if you make a verified fake as the author of the software, it's much easier, because you can get told, before merging any changes that would make the fake unrealistic, that you need to update it, without the need for that once-per-day test run. So I'm hoping that maybe, in the future, having an API which is easily tested against will become kind of table stakes. And one cool thing about making a verified fake: you don't really have to ship your secret sauce. You can ship something that does the bare minimum of your API interface; you can have a really rubbish kind of image-matching thing in it, and the core of your business doesn't need to be shipped to people. So I hope you now have at least a rough idea of what a verified fake is, why it might be useful, and how you can start making one for yourself, and maybe for your users. Thank you very much, that was my talk; I'm happy to take any questions.

Moderator: Thank you. Some questions?
Q: Hi, a very great talk, and 80% overlapping with the one I gave two talks ago, but you've got a case study, which is great; I only had the general discussion. I think your ending is exactly where it should be: there is no justification for releasing a component without a fake. The terminology I'm trying to use is Martin Fowler's, so the distinction between fake and mock; for example, one distinction there is essentially that a mock should be a spy, something that provides an introspection API.

A: Perhaps in this case I didn't worry about that so much, because the API itself provides introspection abilities.

Q: The big thing that's missing here, in my view, is the ability to simulate failures. The example I give is a CPU on fire: you don't really want to be there with a lighter held to your CPU to check that your code handles it. The fake should be programmable to raise an error.

A: Actually, I've got a response to that, and thank you for your question and your comment as well. First of all, the "on fire" case is very difficult to verify, because how do you have a test that checks it against the real thing? When you say "this is going to give a 500", will it give a 500 just like when their servers are down? Their servers aren't down right now. Now, if you check out the source code for VWS-Python, the mock takes a state object, and I have various states, not quite "on fire", but close. So, just like I had this verify-Vuforia fixture, or verify-Vuforia context manager, you can give it a parameter which says broken, inactive, or slow, and then you can see whether your tests work even when there is a five-minute delay in the matching ability. I hope that gives you a little bit of insight into how I've dealt with that issue.
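(An illustrative sketch of that "states" idea; the enum values and fixture shape here are hypothetical, not necessarily the library's actual API.)

```python
import enum
import pytest

class MockState(enum.Enum):
    WORKING = "working"
    BROKEN = "broken"      # respond with 5xx, as if the servers are down
    INACTIVE = "inactive"  # behave like a deactivated Vuforia project
    SLOW = "slow"          # delay matching, like the real processing lag

@pytest.fixture(params=list(MockState))
def vuforia_state(request):
    # Each test using this fixture runs once per simulated failure mode.
    with make_mock_vuforia(state=request.param):  # hypothetical factory
        yield request.param
```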
Q: I really appreciated the talk, but I was wondering: in this kind of service you were mocking, the response basically depended on the data you put in before. How would you go about a service where you don't have the ability to specify the data? For example, if I want to know the events in a specific location, they change every day; they're not in my control. How can I write tests against this data which I don't have?

A: Sure. You can imagine that the event API you're consuming (I think Eventbrite is one of those companies, or meetup.com) also has an "add event" API somewhere, but it might not be public to you. So what you've got to do is act as if you are the meetup.com person, right? You are the meetup.com servers, and you give your mock some ability to add an event, even if the real API doesn't expose one that will be exactly like theirs, so that, given you've already added an event, the mock works with the same structure. Now, if you want to verify it, what you can do is have a test account that has an event in a particular location with a particular image, make your test run against that test account, pre-load that same event into your mock, and then you can say: I want to check this event, and check that the responses are exactly the same. I hope that roughly answers your question, but you're right, it's not a solved issue, it's not always that easy, and it is context-specific.

Q: Is there any way that you could integrate this with fuzzing, to find API responses that you may not have thought of your application using?

A: Sure. I mentioned Hypothesis before; that's the closest tool to fuzzing that I've personally used. If anyone doesn't know it, it's a property-based testing tool, and it generates a lot of tests, which is kind of what fuzzing is. I haven't actually done that here, because the request limits and the slowness made it unworkable; the requests took so long. Actually, the point of doing this for me was so that I could add fuzzing to my code. But you can imagine that if those problems weren't the case, you could say: hey, Hypothesis, or my fuzzing tool, run random requests against my mock and the real implementation, and check that the responses either are exactly the same or share some properties, like having the same keys. That would be ideal, but it really wasn't suitable in this case.

Moderator: Another question? Just after lunchtime.

Q: Hi, a really nice talk. You have libraries like VCR.py, or Betamax, which was ported from Ruby, where you can record a response, and it's recorded as JSON. I wonder why you wouldn't use that for day-to-day testing, and then at midnight, or once a day, just disable the cache and see if the tests still pass.

A: So yeah, VCR-style tools are definitely something I've used a bit, but, I'll put it this way: you have a very similar case, right? The API can change, and when you disable the cache, you have to update your VCR responses, so you've kind of got a very similar thing. But you might not have the "add" component: if I want to add an image here, what do I do in a VCR system? Sorry, I don't have a great answer for that; this is an alternative, I guess, to a VCR system. I think people use VCR to record some other service; I've certainly very briefly contributed to PyGithub, which wraps the GitHub API, and what they do is record responses with VCR. Really, I tried to avoid it, because it came with its own set of problems, and it was more painful for me to use than this system.

Moderator: I think that's it for time. Thank you, Adam.

A: Thank you.