So I'm going to try to break a personal record in how fast I can go through these slides. So, type hints. The last time I talked about this subject, a few months ago, it was a PEP that was a proposal. Now it's been accepted, so that's good news.
But I still want to start with really ancient history. In 2000, fifteen years ago, there was a types-SIG that was already discussing some way of adding type hints, or what at the time we called optional static typing, to Python, as a completely optional way to communicate, either between developers or between the developer and the compiler, about the types of the arguments of functions. And one of the proposals, which I already favored at the time, looks exactly like the annotation proposal that eventually got accepted. But at the time it was too controversial. I picked it up again in 2004 and 2005. There was a series of, again, incredibly controversial blog posts: the sky was falling, people hated it. But at the same time, my own thinking about the topic slowly continued, and I started introducing things like generic functions and generic types. Not too long after that, I realized that the whole topic of defining types for arguments was too controversial to actually introduce in Python 3 as such. But as a compromise we got PEP 3107, which was accepted, and which introduced the function annotation syntax with little or no semantics. The annotations would be introspectable, but otherwise they would be entirely ignored. This was very much a compromise approach, intended so that eventually experiments could be carried out, like what has happened more recently.
Because the recent history is that a few years ago, at PyCon in Santa Clara, I met an enterprising young student who at the time was writing, I think, a dialect of Python that he wanted to have gradual typing.
And I convinced him that if he created a dialect of Python, his language would be great and it would have one user. On the other hand, I said, if you tweak your syntax and add some compromises and mess around a little bit so that it fits in with the existing PEP 3107 syntax, then maybe your work will not just earn you a doctorate but will also be useful for the Python community. And he took that to heart and started experimenting with notations like List[T]. I also ended up working with him at Dropbox.
However, mypy still didn't seem to be going anywhere until last summer at EuroPython, where Bob Ippolito gave a talk on what Python can learn from Haskell. He had three very specific proposals, two of which seemed completely unactionable to me, and the third of which was that we should adopt mypy. So that inspired me and a few other people, including Łukasz Langa, who drafted the first version of a PEP for type hints. And this again was an incredibly controversial, explosive discussion on python-ideas, and afterwards on python-dev, and on IRC, and everywhere else where I didn't want to look. But finally, in my head at least, it began to gel what this was good for, and I met someone who had been thinking about this kind of stuff for a while, Jeremy Siek, whom I'll be mentioning a little later.
So, the architecture that we eventually agreed on, and I think mypy was very instrumental here: static type checking is not a function of the Python interpreter. This is the big light bulb that went on in my head and in other people's heads; even for Jukka it took a while for this to gel. The first version of mypy that supported Python would actually just run your program, and just before running it, it would type check it. Then he added a command-line option to only type check it. And then eventually we removed the command-line option, and with it the ability to actually run the code.
If you want to run it, use the CPython interpreter, or PyPy, or whatever. On the other hand, if you want your code to be type checked, that's a separate thing, just like pylint is a separate thing; it doesn't slow you down at execution time at all. With this architecture, suddenly a lot of things started making sense, and it helped decide how to design all the little details of the static language.
The second part of the architecture is sort of obvious: we're using function annotations for these type hints. You put them in your code, and only the type checker cares about them. The third thing that was really important, and that was also a somewhat smaller light bulb, is that you have to have another way of placing type hints, so that the type hints are separated from the code that they annotate. We call these stub files. You could think of them as header files, but they're not really the same thing, because they're never actually used at execution time; they're only used by the type checker.
Before I go into more detail about all these things: why do you actually want a static type checker? There have been many reasons why people have proposed optional static typing for Python, and some of those reasons were very runtime oriented. People were hoping that at runtime they could catch functions being called with the wrong argument, or people have hoped that at runtime a just-in-time compiler could generate better code, or maybe that at module import time type annotations could be used to generate more efficient code. This is an idea that, for example, Cython actually uses, albeit with a slightly different notation.
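As a small illustration of the architecture described above, in which annotations live in the code but the interpreter does not act on them, here is a minimal sketch. The function name double is invented for this example; it is not from the talk.

```python
# Hypothetical example: the annotations say "int in, int out",
# but CPython only stores them and never enforces them at call time.
def double(x: int) -> int:
    return x * 2

# A static type checker would be expected to flag this call,
# yet the interpreter runs it without complaint.
result = double("ab")

# The annotations remain available for introspection.
annotations = double.__annotations__
```

Whether the call with a string is a bug is a question for a separate tool, which is the point of keeping type checking out of the interpreter.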
However, the real reason why static typing is an important thing is not that it makes your code run faster, because that's an incredibly complicated thing. It is that it helps you find bugs sooner, and the larger your project, the more you need things like this. In fact, people who have really large, old code bases maintained by dozens or hundreds or thousands of engineers are already, within their organizations, running various things that are in some sense static type checkers. There is an additional thing that especially inline type hints help with when you have a large team working on a large code base, which is that new engineers are really helped by seeing the type hints; it helps them understand the code. In part it's just a communication mechanism from programmer to programmer, which in general is always one of the criteria I use for designing parts of Python.
Let's see. So the type hints in particular help a type checker. Python is such an incredibly dynamic language. There are so many clever hacks where you introspect a dictionary or a module or a class, or use a dynamic attribute getter, that things break down very quickly if you do traditional symbolic execution of a program, trying to figure out what the type of an argument is so that you can then check that the argument is used consistently with that type. Well, you can't even find where the call sites are, because everything is dynamic, and there might be four different functions named keys and you can't easily tell which one is being called. Type hints help a static type checker get over those humps. There's a little statistic that the authors of PyCharm told me.
PyCharm is an IDE that has its own partial type inference for Python programs, so that they can show you not just when you're making syntax errors but also when you're calling things that don't exist, or calling them with the wrong number of arguments, and they can make decent suggestions about what methods starting with K might occur at a particular point. They told me that they can correctly infer the type of maybe 50 or 60% of all expressions in a Python program, which means that almost half the time they don't know the type of an expression, which makes it impossible for them to give any useful hints or do any checking. In the case of an IDE, of course, what they have to do in that case is be silent, or use some other fallback heuristic to give suggestions, not say your program is wrong. But nevertheless, if there were type hints in a program, they could often produce more accurate predictions and so on.
I did mention the additional documentation. You find coding conventions at companies that say that in the docstring every argument must be described and the type of every argument must be indicated. Well, if the type of the argument is already part of the syntax, you save a little space in the docstring. Also, if you don't have a docstring at all, a documentation generator can still use the annotations to generate better documentation.
So why do we need these stub files? Why do we need to be able to put the annotations elsewhere? Well, the first use case that you think of very quickly is C extensions. When you start thinking about statically typing anything in Python, you realize that there is a huge number of built-in functions and built-in modules for which you also need to have type information. You can't easily scan the C code and then figure out what the types of all those functions and classes are. So you need to have some dummy Python code that declares the types for the corresponding built-ins and built-in modules. So this is the first use case for stub files.
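The "dummy Python code" for a C extension might look roughly like the sketch below. The module name fastmath and both functions are invented for illustration; they do not refer to any real package. The bodies are just three dots, since a stub only declares signatures.

```python
# Hypothetical stub for an imagined C extension module called "fastmath".
# Nothing here is executed by the real program; a type checker reads the
# signatures in place of the C source it cannot analyze.
from typing import List

def dot(a: List[float], b: List[float]) -> float: ...

def normalize(v: List[float]) -> List[float]: ...
```

Because the syntax is ordinary Python, the same file can be parsed by any tool that already understands function annotations.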
The second use case, and really there's a series of use cases here, has to do with Python code that you might want to annotate, but where there are reasons not to put the annotations in the code. It could be that this is just third-party code. You can stick annotations in third-party code, but now you have made a local modification, and every time you upgrade that third-party package you have to do it again, and that's a lot of work. You can't always push those changes to the third party, because they might not care, there might not be a maintainer, you might be using an old release that doesn't get maintained anymore, maybe they want to be source compatible with Python 2 and the annotation syntax only exists in Python 3, and so on and so forth. Also, there are too many things to try and annotate everything. So stub files are a lighter-weight approach to annotating code that for some reason you don't want to annotate in place.
So when I present all these ideas I still get a lot of very critical, negative looks. A lot of people really like the fact that Python is dynamic, and they don't see any reason why they would pollute their code with stuff that in their mind is associated with troglodyte languages like Java or C++. Well, nevertheless, the people who are maintaining very large code bases often have some form of static analysis. They have things that look in the docstrings and use some convention for storing types in docstrings, and they use that in their analysis. Or they have some kind of static analysis but they don't have annotations at all, neither in docstrings nor anywhere else, and their type checker just isn't very effective. Pylint can only catch so much. So in some sense what this whole proposal is actually introducing is more or less just a standard notation that you can use in case you already want this. It's very much optional.
In Python 3.5, the first version where it's available, it's also provisional, which is a technical term for new standard library modules, and new PEPs in general, where we say: well, we introduced this in the standard library, but we're reserving the right to change the API for one full Python release. So in Python 3.6 the typing module may look a little different; perhaps it's unlikely, but it could even look quite different from how it looks in 3.5. This is something that falls outside the normal guarantees of backwards compatibility. You can read up on this in PEP 411, which explains and defines the concept.
The key thing is that in 3.5 nobody's code will break, and my plan is that beyond that we won't break your code either. But at the same time I do want to take a position. I don't want to say: well, we have the annotation syntax without semantics, let people just do whatever they want to do; they can use mypy if they want to, they can use their own docstring-based convention, they can put type annotations in decorators, let a billion flowers bloom. I think that we've had enough experiments and attempts at doing this that it's better to get everyone behind one proposal. And I was very pleased to see that Google and PyCharm, for example, were both very supportive of this proposal, even though they're not planning to adopt mypy itself; but they are planning to adopt this new syntax.
Some people said: well, okay, maybe you're right, maybe we need a syntax, but you can't force it down our throats; it's unripe, immature, it needs to be thought about more, let's wait until 3.6. But really, that's not going to help anybody. If you want a notation that uses angle brackets instead of square brackets, introducing that is just as hard in 3.6 as it's going to be in 3.5. I started this with what I thought was plenty of lead time, we had a large number of very productive discussion threads, and I just pushed on everything to reach a compromise and get something working. And so if you were hoping to use this for code generation, or if you still believe that type annotations are mostly useful to make your code faster: sorry, that's not actually very high on my list of use cases. PyPy is doing fine without type hints. We'll see what Cython says. Cython, I believe, can already optionally use annotation syntax instead of the traditional Cython notation. Maybe they'll prove me wrong, but CPython certainly is not going to suddenly run your code faster if you put annotations in, and that is not at all part of the plan.
So there's one more thing. PEP 3107 is now, hmm, not quite ten years old, maybe eight years old. There are definitely people who have used annotations creatively and done something completely different with them. Here's an example. I made this up, but I saw something similar, where someone had written a little language for marking up functions that would be invocable from some command line, where the annotations specified, say, the option name used. That's cute. That's not going to break in Python 3.5. However, if you run code like that with mypy in order to type check it, mypy is going to choke on that particular notation, because mypy expects the annotations to be something else.
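A creative, non-typing use of annotations along the lines just described could look like the following sketch. The slide's actual example is not in the transcript, so the function resize, its option strings, and the helper options_for are all made up here.

```python
# Hypothetical "little language": each annotation names the command-line
# option that feeds the parameter. These are plain strings, not types.
def resize(width: "-w", height: "-h", verbose: "--verbose" = False):
    return (width, height, verbose)

def options_for(func):
    """Map each option string to the parameter it annotates."""
    return {flag: param
            for param, flag in func.__annotations__.items()
            if param != "return"}

option_map = options_for(resize)
```

At runtime this keeps working, because the interpreter attaches no meaning to the annotations; a checker that expects types in those positions is what would object.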
Of course, you may not need to run mypy; you may not care at all. Or, if in other parts of your code you actually do want to benefit from type checks, and you think you want to run mypy, but you still want to use this particular notation in some part of your code, there's actually a decorator defined in the PEP that you can use to shut it up. It basically tells mypy: this function (or this class, since you can also use it as a class decorator), ignore the annotations, because they're meant for someone else.
Okay, so that was mostly an apology, a history, the motivational part of the talk. Now I'm going to try and outline a bit how this actually works. How do you think about type hints? If you really want to know, you should probably start with PEP 483, which is a simplified theory behind this stuff. But let me go over a few of the basics.
Here's a very simple function named greeting. It has an argument with a type and it returns a type; they happen to both be strings. Then there's a function greet that calls the function greeting. greet does not use annotations, so greet is not type checked. The basic idea of gradual typing is that both functions can occur in the same program, even in the same module, and a type checker is required to accept that code. If inside the greeting function there was some use of the name argument in a way that is incompatible with it being a string, the type checker will complain about it. However, in the greet function, where there are no annotations to be seen, if you invoke greeting, it's not going to... Perhaps the biggest thing to understand is... if I could only get a mouse. Okay, well, you can see def greet(name); clearly name could be anything. Then print(greeting(name)); the greeting function only accepts a string. However, we're not going to get complaints from the type checker that we don't know for sure that name is a string in this greet function. And that principle, in case of doubt, don't complain, is one of the basics
of gradual typing. That's different from what would happen if we were to assume that name, given that it has no annotation, has the type object. Then we would actually have a type violation in this code, because greeting doesn't take all objects. An object could be a list, and a list is definitely not acceptable for greeting; at least, it's not a string. So instead of being picky, a good type checker using type hints thoroughly checks code that has annotations and backs away from code that doesn't, and lets the two be combined in a useful way. Also, if the annotated code calls something that is unannotated, it will always just assume that the best possible thing will happen there.
So this is the principle. I think I'm repeating myself here, which is unfortunate, because that means less time for questions. Code without annotations is always okay to the type checker. There are some hand-wavy things here, because there are some subtle differences, but basically there is this magical type named Any, which is different from the also somewhat magical type named object. And the absence of annotations can, in first approximation, be seen as annotating everything with the type Any. Any has a bunch of magic properties, and I'll get to that here.
So Any is, confusingly, both at the top and at the bottom of the class hierarchy, or the type hierarchy really. On the one hand, if you ask for any object x, is it an instance of Any? (This is of course a question that the type checker asks itself; it's not a question that you ask at runtime, although I use a runtime notation here to express it.) It's always true. Everything is an instance of Any. Also, everything that's a class is a subclass of Any, which really means it's a subtype; apologies to Mark. On the other hand, and this is the weird part, Any is also a subclass of every other class. You can see I should not try to draw squirrels, but I can draw a very simple diagram with
boxes and lines with the help of PowerPoint. This is a very simple class hierarchy. It has object, which is the built-in object; it has Number and Sequence, which happen to be abstract base classes; and it has NoneType, which is the type of the value None. Now let's add Any. So Any is sort of a superclass of object; it's even higher up in the type hierarchy. But it's also at the very bottom. And if you were to think of this in terms of a classic subclass relationship, everything becomes a mess, because now you can prove that every class in this hierarchy is a subclass of every other class in this hierarchy, which completely collapses everything into a big muddy ball. So we don't want that. We want this version, where there is a separate relationship, formally called is-consistent-with, that is just like the subclass relationship but special-cases Any in either the t1 or the t2 position. You either got this at this point or I'm going to ask you to look it up later; Jeremy Siek has a very good blog post, "What is Gradual Typing".
So what do we have in our typing module? typing.py is a single pure-Python module. It's the only thing that the PEP actually adds to the standard library; very easy to ignore. This is where you import things like Any. So again, there's no new syntax. Syntactically we are constrained by the stuff that Python 3.4, or even 3.2, can already do, and with a little clever operator overloading that's actually not such a terrible constraint. We're not actually adding any type annotations to other parts of the standard library, so if you're looking for examples of type hints, you're going to have to look elsewhere. Also, typing.py itself can be installed in Python 3.2 or 3.3 or 3.4 using pip install.
What does the typing.py module do? It defines a whole bunch of magic objects, like Any and Union, and Dict and List with capital D and L, that are used for expressing types. So here is a little example class; it's kind of messy.
There's a Chart class, and it has a function set_label, and you can see that it's being annotated with some argument types. I don't give the function bodies. There are also some plain functions: make_label and get_labels are not part of the class, they're plain functions, and I just include them to show that you can use a class as a type annotation in some other part of your code. I'm also showing here that you can use the built-in list type as a type: at the bottom you have the argument points, which is a list, and the function get_labels returns a list.
However, that is incomplete information, because we would like to be able to tell the type checker what the type of the elements of these lists is. So there is a new notation using a capital-L List and a capital-T Tuple, which are just some more magic classes that you can import from the typing module. Let's look at the return type first. The return type is a list of strings, so it is written as List[str]: capital-L List, square bracket, str, close square bracket. You can also combine more complicated types. We can have a Tuple of a float and a float, which is a tuple of length two, each item of which has type float, and you can use that as the argument of a List type. So now we know exactly what the type of that points argument is, and we know exactly what the return value is.
You can go one step further. Instead of List you can write ABCs. The typing module exports modified versions of the standard collection ABCs, like Iterable, and you can actually say that the argument can be any iterable of tuples of float and float. However, we still keep the return type. This is pretty idiomatic type hinting: the return type is a concrete list, because we actually declare that it returns a list and not some other sequence.
So what exactly happened there? typing.Iterable is almost just an alias for collections.abc.Iterable. However, it has a little bit of magic behavior added to it,
but it is still usable as a standard ABC. It's usable in all the contexts where collections.abc.Iterable is usable, but it is also a type. The typing.List type shadows the built-in lowercase list, and Tuple has some resemblance to the built-in tuple; however, it's not an immutable sequence, it's more like a structure.
I have been incredibly imprecise in my terminology. Technically we should talk about types when we talk about things that the type checker cares about, and classes when we talk about things that happen at runtime. The reason that most of the time things work out fine if you're fuzzy about the distinction is that all classes are usable as types: when you define a class, that class is always also usable as a type. However, there are a few magic things that are considered types, like Any and Union, that aren't classes.
So, in the very little time I've got left if we want to have any Q&A, a complete enumeration of things that can be used as type hints. Anything that's a class can be used as a type hint. There are these generic types, like List of int. There are the magic things that I haven't all explained yet, although I've given enough of an explanation of Any. You can also define your own generic types.
The first thing that I haven't mentioned yet, which is pretty standard in type theory, is a union type. You could easily have a function that takes either strings or numbers as an argument, and you might use a Union like that. A very common special case of unions is an argument that is either a certain type or None, and we can express that using Optional. Optional doesn't necessarily save you any characters to type, but it certainly gives a very clear intention to the human reader. The type checker actually just expands it to a Union of int and None.
So, Tuple. I already tried to explain how Tuple works. It really is a structure with a fixed number of fields, each with its given type. It's sometimes called a Cartesian product, if you read academic papers.
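The Union, Optional, and Tuple forms just described can be sketched as follows. The three functions are invented for illustration and are not the ones on the slides; only the names imported from typing come from the talk.

```python
from typing import List, Optional, Tuple, Union

def describe(value: Union[int, str]) -> str:
    """Accepts either a number or a string."""
    return "value: {}".format(value)

def find(items: List[str], target: str) -> Optional[int]:
    """Returns an index, or None when the target is absent."""
    for index, item in enumerate(items):
        if item == target:
            return index
    return None

def midpoint(a: Tuple[float, float],
             b: Tuple[float, float]) -> Tuple[float, float]:
    """Tuple used as a fixed-size structure with two float fields."""
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)

# Optional[int] is shorthand for a union with None.
same_type = Optional[int] == Union[int, None]
```

The Optional spelling is chosen here for readability, in line with the point that it signals intent to the human reader rather than saving keystrokes.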
For those people who use tuples as immutable sequences, you can say a Tuple of some type and then dot dot dot, three literal dots, an ellipsis. That's actually an immutable sequence of floats of arbitrary length.
Callable: sometimes you want to say that an argument is a function that takes such-and-such arguments. We have a notation for that. It's not a very elegant notation, but given all our constraints it's the best we can do. If you have a really complicated argument signature, you can just put an ellipsis there, and then it will take anything, and at least you can still talk about the return type.
Generic classes: I'm going to cut this short, but you define these by deriving from a special thing named Generic, using a type variable. Type variables have to be defined explicitly using the TypeVar helper function. The collection ABCs like Sequence are themselves all generic and can be used in this way automatically. You can also define generic functions. Again, you introduce a type variable. If you only ever need one type variable in a particular module, you can just use T everywhere; you don't have to define a new type variable for each function. This is something I'm going to skip in favor of more question time. There is a built-in type variable that can express something that is either a string or bytes, which is a very important idea in Python 3, mostly for Python 2 backwards compatibility, but there we have it.
Oh yeah, now we get into the slightly ugly stuff. Sometimes you have to have an annotation that contains a forward reference: there's an argument, but the class that is used as the argument type hasn't been defined yet. One common example is recursive types. You can put the whole annotation in string quotes, and then the type checker will understand and evaluate that, while CPython just sees it as a string. There are also some cases where you want to annotate variables, especially class
variables that are used as instance variable defaults. This is very useful. We have a type comment for that. And there's also a cast function, if you somehow need to tell the type checker: everything's okay, don't worry, little guy.
So, stub files have a .pyi extension. The bodies in the stub file contain, literally, three dots. In stub files you can define overloading, which is also something I'm going to skip explaining. You can disable type checks in probably too many different ways, but this is to make the people who don't like type hints, or have other uses for function annotations, as happy as possible.
And then finally, here's a list of alternative syntaxes that have been proposed at various times, with what we ended up with on the left. I'll actually skip this. I do notice that nobody actually proposed a return type followed by parenthesized arguments for a callable. The reason that we ended up with the somewhat clunky syntax that's actually in the PEP is that it needs to be easy to parse. We don't want to introduce any new syntax, because we want to be able to backport typing.py to previous Python versions, at least 3.2 and up, and we really don't want to have to change other standard library modules. So if you're a type-theoretical academic, you're probably very unhappy with this proposal, but we can iterate over the next few years, and at least we have the first iteration in our hands rather than in the air.
The PEP has been accepted; thank you, Mark Shannon. Again, the status is provisional. The code is in 3.5 beta 1. And I'm very happy that much of the discussion is behind me, so let's start some more discussion.
So we don't really have time for questions, but we can make time for questions if the next speaker can come up on the stage now. Thank you. So, first question.
Thanks. I really like the idea of type hints; I'm sure that will help us write better, higher-quality code. But I'm not so sure I like the idea of having two options for specifying these type hints: so, in a stub
file or inside the source code itself. That somehow doesn't seem very Pythonic, that there are two options to do one thing. And I have also heard some comments from other people who say argument lists will become very long, so the code will become harder to read. Would you perhaps recommend always using stub files? I can see that IDEs could perhaps inline these in the source file as you're working on it.
Can I ask you to wait for the questions to be finished if you want to leave, so we can hear the answers?
That was a long question. My position is that there are really quite a few downsides to stub files. It's difficult to switch back and forth between the stub and the main code when you're reading the code. On the one hand the argument lists become longer, but if you put all the annotations in the docstring, your docstring becomes longer, and people are okay with that. In many cases the annotations aren't actually so verbose. Some of the examples I gave are impractical; in practice you would always use a type alias, which I forgot to mention. You can just say A is some type expression, and after that A is usable as a type alias. So using type aliases you can make your annotations shorter and also more meaningful. So I think that the case for inline annotations is still pretty strong. At the same time, there are absolutely cases where stubs are the only acceptable solution. So I think that we have to have both.
Over here, I'm raising my hand. Hi, sorry, where's the speaker? Yeah. So, to add what I... I don't know what the proper term would be, but they're effectively arguments to things like List or Callable. Parameters, sorry, the parameters. In Python we use parentheses to specify parameters to things. Why did we use square brackets for these instead?
Because usually the thing before the square bracket is a class, and calling a class already has the meaning of
instantiating the class to an instance. Also, the square brackets make you wonder: well, what's going on here? Something interesting must be going on. And parameterizing types is something quite different from calling a function or instantiating a class. So the square brackets came out because notationally they stand out a little bit, and yet they are actually already part of existing Python syntax. We actually implement the square brackets by overloading __getitem__ on the metaclass.
And that would be the last question.
I have two questions, actually. So the first question would be: is there any way to express covariance and contravariance?
Yes. Great. I didn't get to this, but it is in the PEP. You can have covariant and contravariant type variables; the default is invariant.
Second quick question: how do numeric types work, like floats and ints? Can I pass an int to a float?
That is currently done by a little bit of special casing in the type checker, so that if the specification says float and the actual value is int, that's actually considered a subtype and acceptable.
Well, thank you again, Guido.
My pleasure. My apologies.
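The answer about square brackets, that they are implemented by overloading __getitem__ on the metaclass, can be illustrated with a toy sketch. This is not the typing module's actual implementation; SubscriptableMeta and Box are hypothetical names, and the string result only stands in for whatever parameterized type object a real implementation would build.

```python
# Toy illustration only: subscripting a class calls __getitem__ on its
# metaclass, so Box[int] needs no new syntax.
class SubscriptableMeta(type):
    def __getitem__(cls, item):
        # A real implementation would return a parameterized type object;
        # a descriptive string is returned here to keep the sketch small.
        return "{}[{}]".format(cls.__name__, getattr(item, "__name__", repr(item)))

class Box(metaclass=SubscriptableMeta):
    pass

parameterized = Box[int]   # square brackets: handled by the metaclass
instance = Box()           # parentheses: still ordinary instantiation
```

This shows why the two notations do not collide: calling the class keeps its usual meaning, while subscripting is free to mean parameterization.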