Okay, hello everybody. Let's start with a very small example. I have this tiny program: only three lines, and it seems pretty straightforward. But let's see if we can type check it using mypy. First you need to install mypy; just use pip to install it. You need Python 3 to run mypy, because it's a Python 3 application, but you can also use mypy to type check Python 2 code, which we'll talk about later. The simplest way to type check is to run mypy followed by the name of the file. And it actually turns out that this program has an error: reversed returns an iterator, and you can't append to an iterator. So great, mypy found a bug. Nice.

So what exactly is mypy? It is a static type checker. Static in this context means that it analyzes your code and tries to understand some aspects of it, but it doesn't actually run the code, so it has no idea what exactly is happening at runtime. It tries to find type errors: things like a missing attribute, calling a function or method with the wrong number of arguments, a wrong name for a given argument, a wrong argument type, and things like that. And it needs a bit of help from the programmer: you need to add type annotations to your code to have mypy do useful type checking. Type annotations are part of Python, as was mentioned briefly, and Python 3.5 comes with a typing module in the standard library, which we are going to use later on.

Now I'm just going to give you a very brief overview of how you annotate your code. I'm not going into detail; there's a large number of different kinds of types you can use, so it's pretty flexible, but you can start with just a couple of basic things and then gradually learn more as you go on. So here's a slightly more complicated example. Here we have a user-defined class Cat, and this has type annotations. So here we have a Python 3 type annotation for the __init__ method.
So the argument color has type str. That's basically just a reference to the built-in str class object. And __init__ doesn't return a value, so we have a return type: there's the arrow, -> None, which means that it doesn't return anything. Well, implicitly it returns None, but we mostly don't care about that. Another slightly interesting thing is that the self argument isn't annotated. mypy is clever enough to figure out that the type of self is Cat, so there's no need to explicitly add that redundant information. And then the body seems straightforward, but there's actually something interesting going on there. By looking at the body of __init__, mypy figures out that the class Cat has an attribute called color. It also does type inference: it looks at the initializer, which is just the expression color, and the type of that is str, so it infers that the type of the color attribute is also str. You don't need to annotate that explicitly. Usually you don't need to annotate variables, just functions.

And then I mentioned the typing module. This is in the standard library, and there's a backport for Python 3.4, if somebody is still using it, and for Python 2.7. It has a bunch of types and utilities that are very helpful, and almost all non-trivial code using annotations will use some of those. In this case we import the types Iterable and Set, and we use them in the annotation for the function all_colors. The cats argument is an iterable over Cat objects; we use square brackets to build more complex types out of simple types. The return type also uses square brackets, to say this is a concrete set object with string items. And if you run mypy against this program it finds another error, nice: we actually misspelled the color attribute. So obviously it should be written like that. And now it's green.
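The slide with this example isn't captured in the transcript. A minimal sketch of what it might look like, following the names used in the talk (Cat, color, all_colors, Iterable, Set); the function body and the misspelling mentioned in the comment are assumptions made for illustration:

```python
from typing import Iterable, Set


class Cat:
    def __init__(self, color: str) -> None:
        # self is left unannotated; mypy infers that self.color is a str
        # from the type of the color argument.
        self.color = color


def all_colors(cats: Iterable[Cat]) -> Set[str]:
    # Because cats is declared as an iterable of Cat, mypy knows which
    # attributes each item has. A misspelling such as cat.colour here
    # (hypothetical) would be reported as a missing attribute.
    return {cat.color for cat in cats}
```

Without the Iterable[Cat] annotation on the argument, mypy would have nothing to check the attribute access against, which is the point the talk makes next.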
And again, mypy needs to know the type of cats to know that the color attribute is actually misspelled, because otherwise it has no idea what is supposed to be inside cats.

Okay, another example. In this case we just have a task list, which is an empty list. And mypy actually complains if you try to type check this program, because it doesn't have enough information to decide what the type of the list items is. So you need to help mypy a little in this case. It's straightforward enough: we just import another type from typing and then we add a type annotation. This is a variable type annotation, which was added in Python 3.6, which is pretty nice. You can also leave out the initializer if you just want to declare the type of a variable. So this helps mypy a bit. But what if we are actually not sure what the item type is? Maybe it's some legacy code that nobody understands anymore. In general, for a type that you can't figure out, you can always fall back to Any, which basically means an unknown something. So mypy lets you put anything in that list. It comes at a cost: mypy can't type check as much, but it gives you more flexibility. So you can use this, but you shouldn't use it too much.

Okay, I mentioned that you can also type check Python 2. There obviously you have to use a slightly different syntax. This is also standardized: you use type comments. There are different type comments for functions and variables, and the variable type comment is also useful in Python 3.5, which doesn't support the variable annotation syntax. And again, there's a backport of typing for Python 2, so you have to pip install that, and then you can basically do all the same stuff, except with a slightly different syntax, or quite a different syntax.
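The three annotation styles just described can be put side by side. This is only a sketch: the variable and function names (tasks, current, legacy_items, names, greet) are made up for illustration and are not from the slides.

```python
from typing import Any, List

# Python 3.6+ variable annotation: tells mypy the item type of an empty list.
tasks: List[str] = []
tasks.append("write slides")

# Declaring a type without an initializer is also allowed.
current: str

# Fallback when the item type is unknown: mypy accepts anything in this list,
# at the cost of checking less.
legacy_items: List[Any] = []
legacy_items.append(1)
legacy_items.append("two")

# Type comment syntax, for Python 2 code and for variables on Python 3.5.
names = []  # type: List[str]


def greet(name, times=1):
    # type: (str, int) -> str
    # Function type comment: argument types in parentheses, then the return type.
    return ", ".join(["hello " + name] * times)
```

The comment-based forms carry the same information as the annotation forms; they are just written where Python 2 and Python 3.5 can parse them.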
And if you type check Python 2 code using mypy, then you should use the --py2 option to run mypy in Python 2 mode. You can also have Python 2 and Python 3 code in the same codebase and just run mypy twice with different options.

And a note about Python 3.7: Python 3.7 added some pretty cool stuff related to typing, so you should really consider migrating to Python 3.7 if you're still using something else. In particular I like the from __future__ import annotations thing, which might not sound like much but actually makes life much easier. It lets you have forward references in your types, just as you'd expect. So here Label is defined after the label attribute that refers to it. In Python 3.6 you have to put a string literal around Label in the annotation, which is kind of ugly. So just jump straight to Python 3.7 and you can forget about that problem.

Okay, so now you have a general idea of how types work and what mypy is, but should you use types? What's the point? The main thing that we've heard over and over again from users, and from other people giving talks about mypy, is that type annotations make code much easier to understand, because they act as machine-checked, machine-verified documentation. This is particularly important for large codebases: if you have a team of multiple programmers working on the same code, you can't expect everybody to understand everything, and types really make understanding code much easier. And obviously types can find bugs early. Basically you should run mypy after each change you make, even trivial ones, to make sure that you didn't make a typo or something, because the faster you find bugs, the easier they are to fix, since you remember the code better. So you get fewer bugs in production, which is obviously another big benefit.
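Going back to the forward references mentioned a moment ago: the slide isn't in the transcript, so the following is only a sketch of the idea. The Widget class and make_label method are hypothetical; only the name Label comes from the talk. It assumes Python 3.7 or later.

```python
from __future__ import annotations  # Python 3.7+: annotations are not evaluated at definition time

from typing import Optional


class Widget:
    # Label is defined further down. Without the __future__ import these
    # annotations would have to be written as string literals, e.g. "Label".
    label: Optional[Label] = None

    def make_label(self, text: str) -> Label:
        self.label = Label(text)
        return self.label


class Label:
    def __init__(self, text: str) -> None:
        self.text = text
```

With the import in place, the annotations are stored as strings and only resolved by tools such as mypy, so the order of the class definitions no longer matters for annotations.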
There's also a slightly less obvious thing: by increasing readability, people are less likely to make mistakes by misunderstanding the code, like calling a function with, say, a wrong argument that just happens to run due to duck typing but might produce something completely bogus. Type checking can catch those sorts of issues, so again, fewer bugs thanks to typing. Both of those mean that you spend less time reading code and less time fixing and debugging issues in production, which means more productivity and more time to do actually useful, productive work.

And there are a couple of other big things that are slightly less obvious. IDEs like PyCharm or Visual Studio Code can take advantage of type annotations and give you better code completion, and in PyCharm things like go to definition work better and more reliably. Especially if you have a large code base, the automatic type analyzers don't work that well, but if you have type annotations then it scales much better and works more reliably. Types also make refactoring easier. If you modify the signature of a function, generally you just modify the annotation and then run mypy, and it will show exactly which parts of the code base need to be updated, instead of you getting some weird exception in your tests, or maybe in production, because you forgot to update something. So this really improves productivity, and it also makes it easier to refactor, so you're more likely to refactor code, which means that your code is cleaner and you are less likely to introduce bugs. Cleaner code is also less buggy code. All of these benefits might not be obvious from the beginning, but if you actually start using types, soon you'll be getting the benefits.

Okay, another thing that mypy lets you do is gradual typing. Most users of mypy, I'd expect, started with an existing code base without any type
annotations. The idea is that you can start with no annotations and then gradually add them until you reach a reasonable level of coverage. You don't need to go to 100%. This is a great thing because it makes migrating easier, but there are some issues you should be aware of. So this is another really tiny example. We have two modules, A and B, and B calls function f in A, and obviously it will blow up: it adds an int to a string. So what if you run mypy? Actually mypy says, okay, I can't see any problem. Why is this? Well, the thing is, mypy doesn't type check functions without annotations. This is to make gradual typing easier. If you have huge amounts of legacy code, then mypy will often generate warnings about things that it can't figure out, and you might get hundreds of errors if it checked everything at once. So basically the idea is to gradually add annotation coverage, which means that if you don't have annotations then there's not much mypy can do. So okay, let's add an annotation. But again, there's still no error. Well, the thing is, mypy still doesn't know what f returns, because f doesn't have an annotation. So it still thinks: f returns something, maybe you can add it to a string. Now when you also annotate f, mypy can find the error. So basically it means that if you're just annotating some code, you might also need to annotate other code that it interacts with to get actually useful type checking results. If you have few annotations you get some checking, but not that much; if you have more annotations, then you get more checking. That's the thing to remember, and it's often the answer to 'why can't mypy catch this error?' So this is a really important thing to understand.

Okay, now I assume that you have some existing code that you'd like to migrate to static typing and you haven't used mypy before, so no annotations. The first step I recommend is to try to
get mypy to run against some code without adding any annotations other than the minimal ones needed to get a clean mypy run. As I said, you may need to tweak things a bit: maybe you do some dynamic stuff that mypy doesn't quite understand, or something else, so you need to add some annotations. So it takes a bit of work. I'd recommend starting with maybe 5,000 to 20,000 lines of code; that's really a reasonable size. If you have a million lines of code, you shouldn't try to get all of that running under mypy at once. So maybe one or two days of effort, and if you get a clean run for that, then great, you're doing well. But how do you pick those 20,000 lines of code? That can be a bit tricky, so I'm going to spend some time talking about that.

I guess the simplest thing is to only type check specific directories or specific files: just mypy and some paths, which is great, it works the way you'd expect. But obviously if you have lots of different directories you don't want to enumerate them explicitly, so it doesn't really scale to really large codebases. A slightly more general thing you can do is similar: you can have regular expressions or glob patterns which you use to pick which files not to type check. So basically you use a blacklist, or alternatively you can use a whitelist, with patterns that match the files that you want. In this case we don't want to type check test code, so we have this /test/ pattern. You probably wouldn't want to use exactly this, but something similar might work for you. This is more scalable than the previous one, but it still probably doesn't scale to tens of thousands of files.

Okay, this is what we use at Dropbox. We have tens of thousands of files, and we don't want to maintain manual lists. We also have blacklists;
actually, we probably use all of these in some capacity. But we also have this type comment detection script, which looks for files with type comments, and then we include all those files in the build. So the moment you start annotating a file, it gets automatically picked up by the mypy build. In Python 3 you might have a regular expression which looks for type annotations or an import from the typing module, or you can invent your own marker. That way you can pretty easily add new files to the build without having to modify some configuration file. You have to do a bit of scripting, but it should be pretty straightforward. Again, there's a problem: maybe a file doesn't have any type annotations because it doesn't have a function, or you forget to add the marker, and then you have a problem, because mypy doesn't see that file. So these are tradeoffs. You need to think about this carefully, because it's hard to change these things six months into the project.

And if you have really lots of code, then mypy has a --follow-imports option, which decides what mypy should do with files that are imported from the files you pass on the command line but aren't themselves included on the command line. Skip means: okay, only type check the files on the command line. There's also a silent option, which processes those files but doesn't generate errors for them. These are useful because otherwise you might get a large number of errors from mypy on your first run, so this is a good thing to try.

Okay, next I'm going to talk about something that's not directly related to the previous topic, but you'll hit it pretty quickly when you're annotating any real code. mypy uses stub files to describe the types in library modules. There are stubs for the standard library and a bunch of third-party packages. mypy ships with stubs, so if you only use the standard library
and some common third-party modules, it might just work. But mypy complains if it can't find a stub, because then it can't type check your code and it doesn't want to silently start ignoring stuff. You can recognize stubs by the .pyi extension; again, this is standard.

So what if you are using a third-party library and there's no stub for it bundled with mypy? One thing you can do is look for an external one; maybe somebody has contributed a stub, so you can Google it. There's a recent PEP which lets you install stubs using pip install, which is nice, but unfortunately it's a new thing, so there aren't that many stub packages available. And if you write new stubs, you should consider contributing them to the community. The next thing is that you can just use a type: ignore comment, which means don't report any errors on this line. So okay, we don't ship with SQLAlchemy stubs, but it's not a big problem: just add type: ignore. It makes explicit that mypy can't type check your uses of SQLAlchemy, but it can do a lot of type checking for other stuff. If you only have a few errors, this is a nice, easy way to do it. It also works for other kinds of complaints that you can't figure out: just ignore them initially, and then you can come back later and see if you can fix them in a better way.

What if you have a lot, like 15 imports of SQLAlchemy? In that case you can create a mypy.ini configuration file and tell mypy to ignore all the missing imports targeting these packages, like maybe boto and SQLAlchemy, and then those errors go away. But if you import from something else, you still get an error, so you don't accidentally paper over a large number of errors, only those things that you don't want mypy to report.

Okay, now that you have a clean run from mypy, you have a way of running mypy against some subset of your code base, or all of your code base, depending on the size. You should
commit that script into your repo and have everybody in your team use it to run mypy, because you want consistent results. This should be pretty clear, but it's an important thing to remember. And once you have the script, you can run it in CI to make sure that nobody introduces additional errors accidentally, or at least that you catch them pretty soon.

The fourth step goes back to the modules A and B: you should add annotations to commonly imported library modules, because these modules are used all over the place, and otherwise each time you call something in such a module mypy thinks, I don't know what it is, it's Any, and it can't do much type checking with the result. So pretty early on you should annotate a bunch of library functions and classes. This will greatly increase the coverage, and mypy will start catching more errors pretty soon.

And then you should talk to your team and establish guidelines: if you edit a file, add annotations to the modified functions, and if you write any new functions, annotate them. This way you'll gradually get more and more annotations in your code. When you're writing code you should understand what it's doing, so writing the annotations should be pretty straightforward. You'd have to write a docstring anyway, and annotations are more compact, so you actually save some time, and you get more type checking coverage. So it's a win-win.

And then at some point you should probably look at your legacy code and try to start annotating that, because it will leak imprecise types to the other parts of your code base. You can do it manually; it works pretty well, it's kind of boring but it's not that much work. Or there are tools like MonkeyType and PyAnnotate, which let you collect types: for example, when you run tests, they can collect the actual runtime types, and then they can generate
draft annotations for your code. They aren't 100% correct, but often you just tweak them slightly by hand, and then you can get a lot of code annotated without too much effort. So you should give them a try; they may work, but sometimes they have some rough edges.

Okay, so maybe you've been adding a bunch of files to your build. This means mypy runs might be a bit slower, because it's doing more work. That's useful work, but obviously faster is better. So there are a couple of pretty recent mypy features that can speed up mypy runs by maybe 10x or even more. There's the mypy daemon, which keeps the program state in memory so that for incremental builds it only has to reprocess the modified files. That's a really nice thing. And it goes together with remote caching, which basically lets you download a recent snapshot of the mypy internal state, so that you only have to do an incremental build on top of that recent state instead of processing the entire code base every time you, say, switch to another branch.

So this is a summary of the steps. This should get you started. Then obviously there are more advanced topics, and you can just read the docs afterwards and see whether there's other stuff you want to experiment with.

And finally I'll talk about a few mistakes beginners might make; you want to avoid these. The first mistake is: I'll just start with this single file which I'm working on, I'll annotate it and use mypy to type check it, and then I'll get a good idea of how mypy works. This is not true, because as I said, you need to annotate the library modules and other modules your code interacts with. Otherwise it might look like mypy doesn't check anything, because you haven't annotated that much. So this is not a good way to get started. Another mistake might be: oh, we'll just write annotations for new code that we write, and in 12 months we'll have a lot of our code annotated.
Again, this misses all the legacy code. You probably have a lot of code that doesn't get changed in 12 months, and that will never get annotated. You should annotate the legacy code as well, outside the normal coding workflow. And then mistake three is: oh, we have this million lines, let's type check everything. And then you might get, I don't know, 500,000 errors that you need to fix, and it's: oh no, I give up, it's too much work. You should start with maybe 20,000 lines of code and then gradually increase it, so there's no big bang sort of integration, which is pretty demoralising, and if things go wrong you've wasted a lot of time. So I think you should start simple.

And quickly, I'm going to talk about our experiences of using mypy at Dropbox. We've been using mypy since the pretty early days of mypy, since 2016. We currently have over 2 million annotated lines, and pretty much all teams using Python at Dropbox use mypy. This has been organic growth: people just see that it's useful for them, so they start using it. We've improved mypy along the way to get more teams on board. Currently we are using the mypy daemon, which I briefly mentioned, and we get telemetry from users: a typical incremental run takes about 2 seconds, even if you have millions of lines of code, so it scales pretty well. Compared to six months ago, this has really improved. The other thing is the remote cache that I mentioned, so that when you run mypy for the first time it's also much faster. And then we have a PyCharm mypy plugin, which is open source. You should look it up if you use PyCharm; it makes it easy to run mypy from PyCharm and jump to the errors. It really simplifies the workflow; it's really smooth. So that's all I had. Thank you, everybody.

Thank you, Jukka. Can you raise your hand if you have a question?

Thank you, great talk. My question is about stub files. Supposing I've got a bunch of
C++ code that's exposed to Python, do you know of any way to generate those stub files from the C++ binary?

mypy ships with a tool called stubgen, and it can do some runtime introspection of C extension modules and will try to generate some draft stub files, so you can give it a try. The thing is, it's kind of impossible to do that 100% reliably. So basically, if the module is simple enough it may work; if not, then it doesn't. So basically, try it out.

Just a quick question: does it support Python 2.7 at all?

Python 2.7 is fully supported as a type checking target. You use the type comment syntax and you just run mypy --py2. That's actually what we use at Dropbox, so it's really production quality, as much as Python 3. But you can't run mypy using Python 2.7. That's no problem, though; you can probably install Python 3 to run mypy.

Will Guido be angry if Python becomes a statically typed language?

Guido is totally behind this, and he's actually working on mypy. And no, Python is not becoming a statically typed language; this is totally optional. It's basically mostly for large projects and large code bases, where it's really helpful. So this is just one tool in the toolbox; use it if you think it's useful. If you only have small projects that you work on, then it probably doesn't make sense to learn it. It might still be useful, but it might not be worth the investment in learning this stuff.

Is there some fundamental problem with creating type stubs for psycopg2? It's a very popular project, and mypy still tells us it cannot import stuff from psycopg2. Do you know something about that?
I'm not sure about this particular library. It might be just that nobody has contributed them, because sometimes it's so simple to write the stubs yourself that everybody writes their own set of stubs and then nobody contributes them back. So I recommend trying it out. There's the stubgen utility; it's not well documented, but it's there, so I recommend trying it. Often it's straightforward, unless it's a really big library like Django. That I wouldn't recommend starting out with, because it can take a lot of work.

Do you have any metrics of what kind of quality improvements you get, or how much it speeds you up?

This is something where it's really hard to have hard metrics. But at Dropbox, periodically somebody says, oh, we had an outage or some kind of production issue that would have been prevented if the code had been type annotated. So it definitely helps. But if you prevent something, then how do you know, because it didn't happen? Basically we get feedback from users, and they're really happy. So it's a subjective thing, but when working on a large code base we're pretty convinced that it's mostly a win. Once you get some initial level of type checking coverage, it gets really helpful, but initially you have to make a bit of an investment. If you've just annotated one single file, it's not that helpful.

Thank you. We have time for one last question.

Imagine a library that wants to support multiple versions of its dependencies: with one of them we expect something to have one type, and with the updated one we expect another type. How do you deal with such things?

Can you repeat the question? I didn't quite hear.

Imagine that our library depends on some other library, and from one version of that other library to the next the type changed.

So if you have a third-party library and the signatures change. typeshed can only have one version at a time, but we support pip-installable stub packages, so you can have multiple
versions of the stub package corresponding to different versions of the library, and then you just pin the correct version of the stub package and you get correct type checking. So that's probably the best option.

Thanks. I admire your courage to write such a huge library alone.

Okay, fantastic talk. Thank you, Jukka. Let's give a hand to Jukka.