Thank you. It's a pleasure to be here. Yeah, the working title was a bit different. Just a little bit about me: I'm a back-end engineer, and I work on the core back-end team at Yelp. I've got quite a bit of content, so I'll try to be quick enough that we have time for questions.

One quick word about my employer, Yelp. We have a website and an app, and we help people find great local businesses. We're actually pretty popular, especially in North America. We have over 30 million monthly unique users on our mobile app and over 130 million reviews. So we do work at scale, which people seem to care about.

So, what am I going to be talking about? First, what are type annotations and why should you use them? This will be very short, one quick slide. This talk hopefully starts where the more introductory talks end, so I hope you already know a little bit about type annotations. Then we're going to talk about how to incrementally migrate an existing code base to type annotations, and what issues you might encounter. And the last part, which is actually going to be a relatively big part, is how type annotations can help across services, so across network boundaries.

One thing I'd like to say is that I'm by no means an expert on type annotations. I'm just going to talk about what I learned as a user, so take it with a grain of salt. Oh, and I'm also going to show a bunch of source code. If that's not your thing, I'm sorry; I think it fits with my talk.

That said, if you want more introductory talks, or talks that explain a bit more of the why and the how in the more basic sense, there were two really great talks at PyCon US earlier this year: one by Carl Meyer from Instagram, the other by Greg Price, who is now at Zulip and was at Dropbox previously.
I encourage you to check them out; they're really good talks.

But let me give you a very quick reason to use type annotations. Say you have this code: a super simple function that takes a string and returns a string. If we call it with an integer, we'll get an error. Not at runtime, though; we'll talk about when in a second. Now, you might say this is a contrived example. And by the way, this code works perfectly well with an integer when you run it.

One reason for type annotations, yes, is to help you find bugs and issues in your code. The other is to actually document your code. What the error here basically tells you is that your documentation is wrong or incomplete, which I think is already a big advantage, because the documentation is now part of your code, and if it doesn't match reality you get an error. We all know what otherwise happens with documentation: it gets out of date.

The reason many companies with larger Python code bases use annotations is more about functions like these, where you get items and you don't know what items actually is. This is basically an example I took from Carl Meyer. We know each item probably has a value attribute, and that has an ID, but we don't really know. And it gets even worse if it's a dictionary instead of some other type of object. When you have a large code base that multiple people work on, and you probably joined the company years after the code base was started, type annotations really help with understanding what's going on.

So let's talk about how you might migrate this existing large code base to type annotations. The goal is to end up with all code type-annotated. We want to do this incrementally.
This is not a code base where one person can just sit down, spend a day or two or three, burn through it, and everything is annotated. And we want to make sure that for the code we have already annotated, we do run checks on it, so that we gain value from those annotations.

To check annotated code, we're going to use mypy. I assume most of you have heard about it; it's the de facto standard type checker. It's a sort of linter that you just run on your code, like flake8 or whatever, and it will tell you when there are issues with your code. It's still not at a 1.0 release, and there are still some small issues; we're going to talk about a few of them. But it's really good and it's improving rapidly.

I'm also going to be using the Python 3.6 type annotation syntax, because it's just so much better. You can use type annotations with Python 2 or earlier versions of Python 3; I use this syntax just because it looks better.

There is actually another type checker, in this case for Python 3 only, from Facebook or Instagram, that area. It's supposed to be much faster, especially for large code bases. We're not currently using it internally at Yelp, so I don't have any personal experience with it, but you should definitely check it out.

So let's get started. How do we actually enforce annotations? mypy has a lot of options, and we need to configure it correctly so that it does what we want. Again, our goal here is to make sure, on a file-by-file level, that code is annotated, so that whenever somebody changes something we know whether the code is annotated or not.
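A strict, per-file configuration of this kind might look like the following sketch. The option names are real mypy settings, but which exact options were on the slide is an assumption; only the grouping (three options about unannotated code, two about code outside the files being checked, one about None) follows the description in the talk. The file name is hypothetical.

```ini
# mypy-strict.ini (hypothetical file name)
[mypy]
# Assumed trio: treat unannotated or partially annotated code as an error.
disallow_untyped_defs = True
disallow_incomplete_defs = True
disallow_untyped_calls = True
# Assumed pair: follow imports into files not named on the command line,
# but stay quiet about problems found there.
follow_imports = silent
ignore_missing_imports = True
# Require explicit handling of values that may be None.
strict_optional = True
```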
The first three configuration options basically say that non-annotated code is an error. The next two say that for any code I'm not providing as an argument on the command line, yes, you need to follow it to understand what's going on, but don't tell me about any errors or problems in there.

And then this is an important option: we want to make sure that you handle None correctly. This is a big source of bugs and issues in many languages that don't handle None explicitly, and oftentimes even in Python. Yes, you kind of make sure it's a list or an iterable or whatever, but are you actually making sure that the code doesn't crash or throw an exception when it's None? For the cases where it can be None, you need to be explicit about dealing with it. I'm not sure if this already is, or is going to be, the default in mypy, but just in case it's not, you should probably set it.

This is what I would call recommended by me, so my personal preference for how to do it. There's actually a strict option in mypy that is stricter than this, but you do need to find a trade-off between how much time you invest in writing the perfect annotations and getting code to production.

So how are we going to enforce these annotations, or rather that people write annotations? We're going to use a tool called pre-commit. It's an open source tool, open-sourced by Yelp, that provides all kinds of hooks that you can run when developers commit code. There are a lot of hooks for a lot of different languages and file formats. The Python support is really good because we're a big Python shop, and that's what we're going to use to make sure that code is annotated. So let's take a look at how to configure it.
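A hook configuration of the kind described here could look roughly like this. It is a sketch, not the configuration from the slides: the mirror repository is the commonly used pre-commit mirror for mypy, while the pinned revision and the config file name are assumptions for illustration.

```yaml
# .pre-commit-config.yaml
repos:
-   repo: https://github.com/pre-commit/mirrors-mypy
    rev: v0.910  # assumption: pin whichever mypy release you actually use
    hooks:
    -   id: mypy
        # Hypothetical file name for the strict, per-file configuration.
        args: [--config-file=mypy-strict.ini]
```

Because pre-commit passes only the files touched by a commit to the hook, the strict settings apply file by file rather than to the whole code base at once.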
We need to add a separate hook for mypy. One thing we're going to do is pass it a special config file, the one I showed you a couple of slides earlier. This will make sure that any code that is being touched is fully annotated.

To install pre-commit, you basically run a command just like this. Typically you can do this as part of your test suite, or as part of your make or setup step or whatever, just to make sure that developers have it installed.

Then, when you commit something, this is what it could look like. You can see there's a bunch of tests, or hooks, that run on the files I touched. Where it says skipped, the hook doesn't apply to that file type. And you can see the mypy hook that we just configured is failing, because apparently I modified a file called media.py, and the get-photos future thing I used has an incompatible type. I also forgot to add a return type annotation, and an annotation for one of the arguments of that function.

One thing I'd like to say is that all of this is based on everybody on the team agreeing that this is a good thing. You can definitely skip individual pre-commit hooks, or all of them. So when I say enforce, it's just making sure people don't forget, not forcing people to do things. I'm going to address this a little bit at the end, but obviously everybody should be on board that type annotations are a good thing.

But we do also want to run mypy as part of our test suite, to gain some of the advantages of having type annotations. We're going to use a less strict configuration for that.
So there, it's totally okay to have non-annotated code. Also, it's going to be run on all of our code, not just the files that were changed or touched.

What is going to happen once you start with type annotations is that you won't notice a lot of issues being caught by mypy when you run it as part of your test suite. That is because as soon as it encounters a function that is not fully annotated, it uses the Any type, which means it could be anything, and it will basically stop type-checking there; the value that function returns will also be Any. So it'll take some time to really see results. But remember that those results are going to come, and that from the very first annotation you add, you already have documentation, which already adds value to your project.

That said, it can be a little bit tedious to annotate everything by hand, so there are tools now that can help you get started. Probably the best one, as far as I know or as far as I have tried, is MonkeyType, also by Instagram. It really helps: it can not only gather type information as you run your code and write that to a file, it can then also use that information to automatically annotate your code using the Python 3 annotation syntax. That's a great way to get started. Just remember that it's not a magic bullet. A developer needs to go through those annotations and check them. Sometimes they are too strict, so you can loosen them up. Sometimes they can just be plain wrong.
For example, if you collect the types while running your test suite, you might see a bunch of mock annotations in there, which obviously you don't want.

The second tool, which I think comes out of Dropbox, is PyAnnotate, which we've also tried internally. It also works reasonably well. The main drawback is that, since as far as I know Dropbox is still very much a Python 2 shop, it writes the annotations in the comment form that works in Python 2 as well. If you are on Python 3 or Python 3.6, my personal preference is to use the native syntax; it looks much closer to types in other languages. So that's maybe the main drawback here.

But the most important thing for type annotations is to type your data. Like I said in the motivating slide, actually knowing what data is passed around is super important. I mentioned the dictionary example. You join, you get ramped up, and, well, in our case we work a lot with businesses, so you have a business dict, and then: what's in there? I don't know. And it gets even worse if you then have functions that modify the data along the way. All of a sudden it looks a little bit different. Do I already have the mutated form here, or still the original form? I don't know. That leads to a lot of productivity loss, bugs, and frustration.

Namedtuples are a great way to deal with that. I personally like them a lot. Yes, in Python 3.7 we now have data classes; pick whatever you like best. The good thing about namedtuples is that they prevent mutation, so you can be more or less sure about them once they are created.
They're not going to be modified. And as you can see here, I'm using the new Python 3 syntax for namedtuples. I think the readability is great. You can nest them: here, photos is an iterable of Photo, and Photo is another namedtuple that I'm not showing. I think that's really great; it's really clear what your data structures look like, and we probably want to use those.

Now, there's also a way to type your dictionaries if you don't want to or cannot use namedtuples. As you can see, the syntax is very similar; it's basically just a different base class. And unlike maps or hash maps, or whatever dictionaries are called in other languages, in Python you can have different value types depending on the key, like I'm doing here. So mypy will distinguish, based on the field name, what the type of the value is.

Let me show you an example. I don't know if you can even spot it, but this code, really just two or three lines, has a bug. I'm taking the business dictionary, calling the get function, and saying: give me this address2 field, and if it's None, use the empty string. I'm trying to be a good developer and prevent issues with None when the code down the line expects strings. And this is a really insidious bug, because it works at runtime 100% of the time. It will never crash, it will never raise an exception, but it will also never return anything other than the empty string, because I forgot a d: the second d in address2 is missing. So unless you have really good test coverage, you won't notice. And even then, you know, because you know your code is not buggy, what happens when your test fails?
Oh, then in your test mock or whatever you also write address2 with just one d, and all of a sudden your test passes. All of these things have happened already. So this is pretty bad.

But here I'm using the BusinessDict annotation that we see up above, and what happens is that mypy will tell you, with the code as written here, that BusinessDict has no key 'adress2'. So even if for some reason you need to continue using dicts, it's worth taking the time to write these typed definitions of your dictionaries and then using them just for annotation. The BusinessDict class you can actually use in code, so you can instantiate an instance of it and things like that. But even if you just use it as an annotation, it works really well and can help you prevent issues like these.

[Audience question, not captured]

Then, yeah, you're on your own. I don't think it can know. What I didn't mention: mypy is a static type checker, so all of this has no effect at runtime. If at, quote-unquote, compile time, just by looking at your source code, it cannot figure out what the name of the field is, because it's dynamic or whatever, then the type checker won't help. Does that answer your question?

[Audience question, not captured]

I don't know if it does that; that's very specific. I would need to test it.

That said, I hope you are maybe convinced that it's a good idea to use namedtuples. So the question is: how do you do that?
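Before getting to that, the misspelled-key example from a moment ago can be sketched as follows. This is an illustration under assumptions, not the slide code: the field names other than address2 and the function names are made up, and `TypedDict` is imported from `typing`, which assumes Python 3.8 or newer. The exact message a type checker prints for the buggy line depends on the tool and version.

```python
from typing import Optional, TypedDict


class BusinessDict(TypedDict):
    # Hypothetical fields; only address2 comes from the talk.
    id: int
    name: str
    address1: str
    address2: Optional[str]


def second_address_line_buggy(business: BusinessDict) -> str:
    # Bug: the key is misspelled ("adress2"), so this always falls back to "".
    # It never raises at runtime; a static checker is expected to flag it.
    return business.get("adress2") or ""


def second_address_line(business: BusinessDict) -> str:
    # Correct key: returns the stored value, or "" when it is None.
    return business.get("address2") or ""


cafe: BusinessDict = {
    "id": 1,
    "name": "Cafe",
    "address1": "1 Main St",
    "address2": "Suite 2",
}
print(repr(second_address_line_buggy(cafe)))
print(repr(second_address_line(cafe)))
```

At runtime both functions run without complaint, which is exactly why the misspelling is so easy to miss without a type checker.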
One thing we could do is say: we have our namedtuple definition, and we actually use that as the source of truth for what all the data should look like. And since we might get dictionaries from some external data source, JSON or whatever, let's write a small helper function that converts such a dictionary to a namedtuple. That's what it looks like. I've got to speed along a little bit.

We're going to use this helper function to look at some of the issues we might run into when we type our code. As you might notice, this function is not annotated. So how would we annotate it?

A first try might be to just use the NamedTuple type. We say nt_class here is, well, not a namedtuple instance but the type of a namedtuple instance; dict_values is pretty clear; and we return a NamedTuple. This doesn't work, because what we return is not just some namedtuple instance: we care about which specific namedtuple it is, not just a general namedtuple. About the second error, we're going to see that in a second.

So what do we do? We learn about generics and try to use those. We create a TypeVar that is bound to NamedTuple, which says: whatever class or type this is, the base type must be NamedTuple. Then we use that here and say: whatever class you pass into this function as an argument, we're going to return an instance of it as output. That should fix the error we saw earlier. Unfortunately, this doesn't work either, because we cannot use NamedTuple as a base class. That is also the reason for the second error on the slide for our first try. NamedTuple is actually not a class; it's a class constructor that returns a class. There's actually no common NamedTuple base class.
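The helper under discussion can be sketched like this, annotated structurally rather than with a NamedTuple base class, which is the direction the talk takes next. The names `FieldsMake`, `dict_to_namedtuple`, `nt_class`, `dict_values`, and the `Business` fields are assumptions for illustration, and `Protocol` is imported from `typing`, which assumes Python 3.8 or newer. Whether a particular mypy version accepts every namedtuple class against this protocol has not been verified here; the runtime behaviour is what the sketch demonstrates.

```python
from typing import Any, Iterable, Mapping, NamedTuple, Protocol, Tuple, Type, TypeVar

T = TypeVar("T", bound="FieldsMake")


class FieldsMake(Protocol):
    """Anything that exposes _fields and a _make classmethod, as namedtuples do."""

    _fields: Tuple[str, ...]

    @classmethod
    def _make(cls: Type[T], iterable: Iterable[Any]) -> T:
        ...


def dict_to_namedtuple(nt_class: Type[T], dict_values: Mapping[str, Any]) -> T:
    # Pick out exactly the fields the namedtuple declares, in declaration
    # order, and ignore any extra keys in the dictionary.
    return nt_class._make(dict_values[field] for field in nt_class._fields)


class Business(NamedTuple):
    # Hypothetical fields for the example.
    id: int
    name: str


business = dict_to_namedtuple(Business, {"id": 1, "name": "Cafe", "extra": True})
print(business)
```

The return type is tied to the class passed in, so a caller that passes `Business` gets back a `Business` rather than a generic tuple.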
So what we need to do here is use protocols, which are a relatively recent but really cool mypy feature that is basically the solution to duck typing. We can say, hey, this is how this argument or this type behaves, instead of saying, hey, it's an int or whatever. Here we can say that we need something that has a _fields attribute and a _make method, and that we then return an instance of that. So here we say it's a protocol with _fields and _make, and that this is our base type: anything you pass in here needs to conform to that. And all of a sudden it works.

I'm going to quickly go through some of the other issues; we might not have time for questions at the end.

If you do use namedtuples, it turns out that count and index are pretty common attribute names. But there actually is a base class for namedtuples, which is tuple; that is the common base class for all namedtuples, and it has a count and an index method. If you use a namedtuple like this, it will actually work; you basically just need to silence these mypy errors. Just something to know.

Another thing I'd like to mention: if you have a larger code base, maybe you have a descriptor, so an object that manages an attribute or a property of another object. In this case we have this set-once property descriptor, which makes sure you can set the property once, but then only read it and never write to it again. It's a generic class.
We can see two types here: the type of the object whose property we're managing, and the type of the value. We use those here in __get__ and __set__. I'm just going to go through this quickly; the slides are online. And here we can see that we create a concrete type for the context class. By the way, you use strings if you need to refer to a type that is not fully defined yet. We say the biz user ID in this case is an integer, and mypy will understand that you can only assign ints to biz user ID, and that when you read it, it's an int. I'm mentioning this because descriptors are not yet documented. So if you want to do that, please do so and open a pull request.

Another thing you should be aware of: mypy doesn't support recursive types. So anytime you have a tree- or graph-like structure, it won't work; you'll get an error. There's a ticket for it. A workaround is basically to ignore these errors and then use the cast function from the typing module to annotate by hand. But typically, whenever you have these kinds of data structures, you kind of lose type information.

So let's try to get quickly through the last part of my talk: how to do type annotations with distributed code. The problem there is really testability. When our code calls several other services, how can we make sure that all of this works, that the data we're receiving and using is in the right format?

At Yelp we have an API specification; we use OpenAPI. From that we create a client library that we then use to make API calls. Just quickly, this is what an endpoint looks like: it's a GET request, that's the URL, and we return a Business object as the response, which is defined like this. It looks very similar to the namedtuple definition or the TypedDict definition, just maybe a little bit more verbose. And this is how we make the call with the client lib that we have at Yelp.
It's actually, for those who know it, a wrapper around the bravado library, which is open source. You can see that for us, it seems like we're dealing just with Python objects, and it does all the network stuff in the background. Here we create the client, then we make the call; this is what actually makes the call.

So let's write a small helper function to get the review rating of a business, which is the aggregated rating of all of its reviews. And let's write a test for it, because that's mainly the problem, right? Typically you have to do end-to-end tests or whatever to test network code. But maybe we can use mocks and get away with it. If we do that, we just use a mock here, create a fake business, and set that as the return value. It's a little bit unwieldy, but it works here for our mock client, and this test passes.

Now, what I didn't mention, and I'm going to go back a little bit here: review rating is actually of type string. That's for historical reasons. It's actually a real example that we have at Yelp. There were issues with floats, and serializing and deserializing them, in earlier Python 2 versions. Not a problem anymore, but that's still the endpoint definition. So this code has at least an annotation bug, because it says it returns a float when in fact it returns a string. And if the code down the line gets the wrong type, that's probably going to be a runtime issue. So our test passes, but the code actually has a bug.

So what can we do here? How about we take the API spec and write some code to generate annotations from it? Let's start with the model stuff.
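Before turning to the generated models, the mock-based test just described can be sketched as follows. This is a reconstruction under assumptions, not the slide code: the function name, the `client.business.business_info(...).result()` call shape, and the rating values are illustrative, and the client is a plain `unittest.mock.Mock` rather than a real service client.

```python
from typing import Any
from unittest import mock


def get_review_rating(client: Any, business_id: int) -> float:
    # The annotation promises a float, but the (assumed) endpoint definition
    # says review_rating is a string, so this annotation is wrong.
    business = client.business.business_info(business_id=business_id).result()
    return business.review_rating


def test_get_review_rating() -> None:
    client = mock.Mock()
    # The fake business carries a float, matching the wrong assumption above.
    fake_business = mock.Mock(review_rating=4.5)
    client.business.business_info.return_value.result.return_value = fake_business
    assert get_review_rating(client, business_id=1) == 4.5


# The test passes, although a real response would carry a string such as "4.5".
test_get_review_rating()
```

Because a mock accepts any attribute with any value, neither the test nor a type checker can notice that the fake data disagrees with the API specification.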
So, these data objects that I showed you: this is a class that mimics what I showed you in the OpenAPI spec, and we see review rating as a string. We want this to be generated; we don't want people to type it out. So let's use it. This is the same function; I'm just repeating it so that you know what I'm talking about when I show you the new test code. And this new test code doesn't use a mock; it uses the model class as the return value. If we do that, we actually get an error from mypy, because it knows that in the business model we generated, review rating is of type string, so I shouldn't assign a float.

But can we go even a step further? There's this very dynamic client object that I've shown you, which basically reads the spec and generates these attributes and methods. Can we annotate that as well? That would be really cool, because with just these model classes, even if we auto-generate them, people need to manually add annotations to their functions. Yes, you can use them in tests like I've shown you, but we also want our non-test code to have annotations.

And it's actually possible. This is a shortened version of what we have at Yelp. And all of this, I should really reiterate, needs to be generated, so in the end you don't have to do any of it manually. For this business service we have this business service annotation. It has, in this case, one attribute, business. So client.business now has an annotation: it's a business resource. And what is a business resource?
Well, the business resource has this business info method. So now we know it takes one argument, business ID, and what its type is, and if we call this thing the wrong way, mypy will give you an error. It returns a future that is defined below; that is a concrete type, since we pass it the model that is being returned. So we automatically get all of these type annotations, and mypy will complain if you make this network call incorrectly or if you use the data it returns incorrectly. And with all of this, modern IDEs like PyCharm even give you IntelliSense and autocompletion and all of that stuff.

So, takeaways. Annotate your code to improve documentation and catch bugs earlier. Fine-grained, typed data structures really help you understand the data flow of your application. Annotations reduce the number of tests you have to write, and especially make the tests you do write more valuable. And you can use generated annotations to type-check communication across network boundaries.

Check us out on social media, and especially the engineering blog. We are hiring, so please come talk to us at the booth. You can find the slides over there. Thanks a lot.